Hypothesis testing is a fundamental issue in statistical inference and has been a crucial element
in the development of information sciences. The Chernoff bound gives the minimal Bayesian
error probability when discriminating two hypotheses given a large number of observations. Recently
the combined work of Audenaert et al. [Phys. Rev. Lett. 98, 160501] and Nussbaum and
Szkola [quant-ph/0607216] has proved the quantum analog of this bound, which applies when the
hypotheses correspond to two quantum states. Based on the quantum Chernoff bound, we define a
physically meaningful distinguishability measure and its corresponding metric in the space of states; 
the latter is shown to coincide with the Wigner-Yanase metric. Along the same lines, we define 
a second, more easily implementable, distinguishability measure based on the error probability of 
discrimination when the same local measurement is performed on every copy. We study some gen- 
eral properties of these measures, including the probability distribution of density matrices, defined 
via the volume element induced by the metric, and illustrate their use in the paradigmatic cases of 
qubits and Gaussian infinite-dimensional states. 

PACS numbers: 03.67.Hk, 03.65.Ta 



I. INTRODUCTION 



About fifty years ago Herman Chernoff proved his famous bound, which characterizes the asymptotic behavior of the minimal probability of error when discriminating two hypotheses given a large number of observations [1]. Its quantum analog was recently conjectured [2] and finally proven by combining the results of two recent publications [3, 4]. In this quantum setting one is confronted with the problem of knowing the minimum error probability in identifying one of two possible known states of which N identical copies are given. Hereafter we will refer to this minimum simply as the error probability $P_e$. This problem is widely known as quantum state discrimination 1. Its difficulty (but also its appeal) lies in the fact that quantum mechanics only allows for full discrimination of such states when they are orthogonal. This has both fundamental and practical implications that lie at the heart of quantum mechanics and its applications.

1 See [?] and [?] for two reviews on the recent and more historical developments of this field, respectively.

For the past fifty years the classical Chernoff bound, as well as hypothesis testing in general, has proved to be extremely useful in all branches of science. Likewise, one would expect its quantum version to be far more than a mere academic issue. The characterization and control of quantum devices is a necessary requirement for quantum computation and communication, and quantum hypothesis testing is specifically designed for assessing the performance of these tasks. Particularly important examples for which state discrimination plays an essential role are quantum cryptography [?], the classical capacity of quantum channels [?], and even quantum algorithms [?]. Equally important are some new theorems concerning different quantum extensions of hypothesis testing: the quantum Stein's lemma, proved some years ago [10, 11], and the quantum Hoeffding bound, recently established in [?].

In this paper we study the classical and the quantum Chernoff bounds in connection with measures of distinguishability for quantum states, putting special emphasis on the qubit and Gaussian cases. We start by reviewing classical and quantum hypothesis testing and the corresponding Chernoff bounds in Sec. II and Sec. III, respectively (the latter includes the aforementioned recent results by Nussbaum and Szkola [4] and Audenaert et al. [3]). In Sec. IV we discuss the notion of a distinguishability measure for quantum states. We briefly motivate an important instance of such a notion based on classical statistical measures, that is, the quantum fidelity, and move to a fully operational alternative, based on the asymptotic rate exponent of the error probability in symmetric quantum hypothesis testing: the quantum Chernoff measure 2. We also discuss a similar distinguishability measure derived from the same rate exponent when the decision is based on N identical single-copy (local) measurements, instead of the collective measurements on the N copies assumed in the derivation of the quantum Chernoff bound. In Sec. V we study the metrics induced by the previously defined measures of distinguishability and give explicit expressions for general d-dimensional systems. We also give the probability distribution of the eigenvalues of a d × d density matrix based on the quantum Chernoff metric (induced by the corresponding distinguishability measure). We find that the metric based on local measurements is discontinuous and has to be defined piecewise: on the set of pure states, where it agrees with the Fubini-Study metric, and, separately, on the set of strictly mixed states, where it agrees with one-half the Bures-Uhlmann metric. The quantum Chernoff metric, in contrast, is continuous and smoothly interpolates between the Fubini-Study and one-half the Bures-Uhlmann metrics. In Sec. VI we concentrate on the particular case of two-level systems and study in some depth the differences between the quantum Chernoff measure and metric and those based on identical local measurements. In Sec. VII we give explicit expressions of the quantum Chernoff measure and its corresponding induced metric for general Gaussian states. Finally, we state our conclusions in Sec. VIII.



II. CLASSICAL HYPOTHESIS TESTING: 
CHERNOFF BOUND 

One of the most fundamental problems in statistical decision theory is that of choosing between two possible explanations or models, which we will refer to as hypotheses $H_0$ and $H_1$, where the decision is based on a set of data collected from measurements or observations. For example, a medical team has to decide whether a patient is healthy (hypothesis $H_0$) or has a certain disease (hypothesis $H_1$) in view of the results of some clinical test. Often, $H_0$ is called the working hypothesis or null hypothesis, while $H_1$ is called the alternative hypothesis. In general these two hypotheses do not have to be treated on equal footing, since wrongly accepting or rejecting one of them might have very different consequences. These two types of errors, i.e., the rejection of a true null hypothesis or the acceptance of a false null hypothesis, are called type I and type II errors, respectively, and their corresponding probabilities will be denoted by $p(1|H_0) = p_0(1)$ and $p(0|H_1) = p_1(0)$ throughout the paper. In our example, failure to diagnose the disease is a type II error, whereas it is a type I error to wrongly conclude that the healthy patient has the disease. Of course it would be desirable to minimize the two types of errors at the same time. However, this is typically not possible since reducing those of one type entails increasing those of the other type. Hence, a common way to proceed is to minimize the errors of one type, while keeping those of the other type bounded by a constant (which may depend on the number of observations). Another (Bayesian-like) approach consists in minimizing a linear combination of the two error probabilities $P_e = \pi_0\, p(1|H_0) + \pi_1\, p(0|H_1)$, where $\pi_0$ and $\pi_1$ can be interpreted as the a priori probabilities that we assign to the occurrence of each hypothesis. In this paper we consider this latter approach, which is known as symmetric hypothesis testing.

2 By 'operational' it is meant 'defined through a specific procedure or task', in contradistinction to 'purely mathematical'.

For the sake of simplicity, we assume to start with that $\pi_0 = \pi_1 = 1/2$, and we deal with tests that have only two possible outcomes, $b = 0, 1$. This is, for example, the situation that corresponds to the identification of a biased coin that can be (with equal probability) of one of two types: 0 or 1 (corresponding to hypothesis $H_0$ or $H_1$, respectively). If it is of type 0, the probabilities of obtaining head and tail are respectively $p_0(0) = p$ and $p_0(1) = 1 - p \equiv \bar p$, while if it is of type 1 we write $p_1(0) = q$ and $p_1(1) = 1 - q \equiv \bar q$. The test consists in tossing the coin, which has two possible outcomes: either head ($b = 0$) or tail ($b = 1$).

If we can toss the coin only once (single observation), it is easy to convince oneself that the minimum (average) probability of error is attained when we accept the hypothesis (decide that the tossed coin is of the type) for which the observed outcome occurs with largest probability. Therefore 3

$$
P_e = \frac{1}{2}\sum_{b=0}^{1}\min\{p_0(b),\,p_1(b)\}
\le \frac{1}{2}\min_{s\in[0,1]}\sum_{b=0}^{1} p_0^{s}(b)\,p_1^{1-s}(b) \equiv P_{CC}, \tag{1}
$$

where we have used the inequality $\min\{p, q\} \le p^{s} q^{1-s}$.



3 In this formula, as well as in most of the formulas involving minimization throughout the paper, one should properly write $\inf_{s\in(0,1)}$ instead of $\min_{s\in[0,1]}$, since the minimum may not exist if $p_0$ and $p_1$ ($\rho_0$ and $\rho_1$ in the quantum case) are degenerate and have different support. This is so because in this case the continuity of the argument of $\min_{s\in[0,1]}$ in all these equations is guaranteed only in the open interval (0, 1) and (end-point) singularities may occur at $s = 0, 1$. We will overlook this mathematical subtlety in the main text to simplify the exposition.
The subscript CC stands for classical Chernoff. This expression also holds for tests with more than two outcomes. We just need to extend the sum over b to the entire range of possible outcomes. In what follows, we leave the range of b unspecified whenever an expression is valid for an arbitrary number of outcomes.
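As a quick numerical illustration of Eq. (1), the following sketch compares the exact single-observation error with the bound $P_{CC}$. The function names and the coin biases p = 0.7, q = 0.4 are illustrative assumptions, not values taken from the text. The minimization over s is done on a plain grid and restricted to the common support of the two distributions, which sidesteps the end-point subtlety at s = 0, 1.

```python
import numpy as np

def single_shot_error(p0, p1):
    """Exact minimum error for one observation and equal priors: 1/2 sum_b min{p0(b), p1(b)}."""
    p0, p1 = np.asarray(p0, dtype=float), np.asarray(p1, dtype=float)
    return 0.5 * np.minimum(p0, p1).sum()

def chernoff_upper_bound(p0, p1, grid=1001):
    """P_CC = 1/2 min_s sum_b p0(b)^s p1(b)^(1-s), minimized on a uniform grid of s values."""
    p0, p1 = np.asarray(p0, dtype=float), np.asarray(p1, dtype=float)
    # Outcomes outside the common support contribute nothing for 0 < s < 1.
    common = (p0 > 0) & (p1 > 0)
    s = np.linspace(0.0, 1.0, grid)[:, None]
    sums = (p0[common] ** s * p1[common] ** (1.0 - s)).sum(axis=1)
    return 0.5 * sums.min()

# Hypothetical biased coins: type 0 gives heads with p = 0.7, type 1 with q = 0.4.
p0 = [0.7, 0.3]
p1 = [0.4, 0.6]
print(single_shot_error(p0, p1), chernoff_upper_bound(p0, p1))
```

The same two functions accept distributions with any number of outcomes, in line with the remark that the sum over b can be extended to the entire range of outcomes.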

Next, let us assume we can toss the coin N times. The set of possible outcomes (the sample space) is the N-fold Cartesian product of {0, 1} (or {head, tail}). The two probability distributions of these outcomes, $p_0^{(N)}(b^{(N)})$ and $p_1^{(N)}(b^{(N)})$, will be given by the product of the corresponding single-observation distributions, $p_i^{(N)}(b^{(N)}) = p_i(b_1)\,p_i(b_2)\cdots p_i(b_N)$, where now $b^{(N)} = (b_1, b_2, \dots, b_N) \in \{0,1\}^{\times N}$, and one immediately obtains [15]

$$
P_e \le \frac{1}{2}\min_{s\in[0,1]}\left[\sum_{b} p_0^{s}(b)\,p_1^{1-s}(b)\right]^{N}. \tag{2}
$$



This is the Chernoff bound [1]. It is especially important because it can be proved to give the exact asymptotic rate exponent of the error probability, that is,

$$
P_e \sim e^{-N C(p_0,p_1)};\qquad
C(p_0,p_1) \equiv -\lim_{N\to\infty}\frac{1}{N}\log P_e
= -\min_{s\in[0,1]}\log\sum_{b} p_0^{s}(b)\,p_1^{1-s}(b). \tag{3}
$$



The so-called Chernoff information, or Chernoff distance, $C(p_0, p_1)$, can also be written in terms of the Kullback-Leibler divergence $K(p_0/p_1) = \sum_b p_0(b)\log[p_0(b)/p_1(b)]$ [?]:

$$
C(p_0,p_1) = K(p_{s^*}/p_0) = K(p_{s^*}/p_1), \tag{4}
$$

where

$$
p_s(b) = \frac{p_0^{s}(b)\,p_1^{1-s}(b)}{\sum_{b'} p_0^{s}(b')\,p_1^{1-s}(b')},\qquad s\in[0,1], \tag{5}
$$

is a family of probability distributions known as the Hellinger arc that interpolates between $p_0$ and $p_1$, and $s^*$ is the value of s at which the second equality in (4) holds. In other words, it is the point at which $p_s$ is equidistant to both $p_0$ and $p_1$ (in terms of Kullback-Leibler distance). It can be shown that $s^*$ is also the value of s that minimizes the right hand side of (3).

For the case of measurements with two outcomes, such as the example of the coins discussed above, one can give a closed expression for the Chernoff distance, which we denote in this binary case as $C(p, q)$:

$$
C(p,q) = \xi\log\frac{\xi}{p} + \bar\xi\log\frac{\bar\xi}{\bar p}, \tag{6}
$$

with $\bar\xi = 1 - \xi$ and

$$
\xi = \frac{\log(\bar q/\bar p)}{\log(p/\bar p) + \log(\bar q/q)}. \tag{7}
$$
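The closed form of Eqs. (6) and (7) can be checked against a direct minimization over s of the binary version of Eq. (3). The sketch below is only an illustration: the function names are hypothetical, and the ternary search relies on the objective being convex in s.

```python
import math

def xi_boundary(p, q):
    """Eq. (7): the fraction of heads at which both coin types are equally likely."""
    pb, qb = 1.0 - p, 1.0 - q
    return math.log(qb / pb) / (math.log(p / pb) + math.log(qb / q))

def chernoff_closed_form(p, q):
    """Eq. (6): binary Chernoff distance C(p, q)."""
    xi = xi_boundary(p, q)
    return xi * math.log(xi / p) + (1.0 - xi) * math.log((1.0 - xi) / (1.0 - p))

def chernoff_numeric(p, q, iters=200):
    """-min_s log[p^s q^(1-s) + (1-p)^s (1-q)^(1-s)], found by ternary search on [0, 1]."""
    def f(s):
        return math.log(p ** s * q ** (1 - s) + (1 - p) ** s * (1 - q) ** (1 - s))
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return -f(0.5 * (lo + hi))

# Hypothetical biases, chosen only for illustration.
print(xi_boundary(0.7, 0.4), chernoff_closed_form(0.7, 0.4), chernoff_numeric(0.7, 0.4))
```

For these biases the boundary $\xi$ falls between q and p, as expected for a threshold separating the two coin types.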



The parameter $\xi$ has a very straightforward interpretation. If $N_0$ is the number of heads (of 0's) after N trials, which according to the distribution $p_0$ occurs with probability

$$
P_0(N_0) = \binom{N}{N_0}\, p^{N_0}\, \bar p^{\,N-N_0} \tag{8}
$$

[according to the distribution $p_1$ it occurs with probability $P_1(N_0)$, defined the same way but with p replaced by q], then $\xi$ is the fraction of heads above which one must decide in favor of $p_0$. That is, if $N_0 > \xi N$ one accepts hypothesis $H_0$, while if $N_0 < \xi N$ one accepts $H_1$. Asymptotically, the contribution to the error probability is dominated by situations where $N_0 = \xi N$, i.e., by events that occur with the same probability for both hypotheses (see Fig. 1). The probability of such events is clearly a lower bound to the probability of error. It is straightforward to check that $-\lim_{N\to\infty}\log P_0(\xi N)/N$ [or equivalently $-\lim_{N\to\infty}\log P_1(\xi N)/N$] coincides with the upper bound given by the Chernoff distance $C(p, q)$. This proves that the Chernoff bound is indeed attainable.
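The attainability argument can be illustrated numerically: the rate $-\log P_0(\xi N)/N$ of the binomial probability at the decision boundary should approach $C(p, q)$ as N grows, and likewise for $P_1$. The sketch below is a check under assumed biases p = 0.7, q = 0.4 (not from the text); it uses `math.lgamma` to evaluate the logarithm of the binomial coefficient without overflow and rounds $\xi N$ to the nearest integer.

```python
import math

def log_binomial_pmf(n0, n, p):
    """Logarithm of the binomial probability of n0 heads in n tosses, as in Eq. (8)."""
    return (math.lgamma(n + 1) - math.lgamma(n0 + 1) - math.lgamma(n - n0 + 1)
            + n0 * math.log(p) + (n - n0) * math.log(1.0 - p))

def boundary_rates(p, q, n):
    """Return (-log P_0(xi N)/N, -log P_1(xi N)/N, C(p, q)) for n tosses."""
    pb, qb = 1.0 - p, 1.0 - q
    xi = math.log(qb / pb) / (math.log(p / pb) + math.log(qb / q))       # Eq. (7)
    c = xi * math.log(xi / p) + (1.0 - xi) * math.log((1.0 - xi) / pb)   # Eq. (6)
    n0 = round(xi * n)  # nearest integer number of heads to the boundary
    return -log_binomial_pmf(n0, n, p) / n, -log_binomial_pmf(n0, n, q) / n, c

for n in (100, 10_000, 1_000_000):
    print(n, boundary_rates(0.7, 0.4, n))
```

The residual difference at finite N comes from the sub-exponential prefactor of the binomial distribution, which contributes a correction of order (log N)/N to the rate.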




FIG. 1: (Color online) Each curve represents the probability to obtain $N_0$ heads after N tosses of a biased coin that can be of one of two types, 0 or 1. The probability that the coin of type 0 (1) produces a head at any given toss is p (q). For large N these curves approach Gaussian distributions centered at pN and qN, respectively. The point $\xi N$ where they cross defines the decision boundary (see main text). The error probability is given by the shaded area.



III. QUANTUM HYPOTHESIS TESTING: 
THE QUANTUM CHERNOFF BOUND 

We now tackle discrimination (symmetric hypothesis testing) in a quantum scenario. We consider two sources, 0 and 1, that produce states described respectively by the density matrices $\rho_0$ and $\rho_1$ acting on a Hilbert space $\mathcal{H}$. We are given N copies of a state $\rho$ with the promise that they have been produced either by the source 0 (with prior probability $\pi_0$) or by the source 1 (with prior probability $\pi_1 = 1 - \pi_0$). Accordingly, we can formulate two hypotheses ($H_0$ and $H_1$) about the identity (0 or 1, respectively) of the source that has produced these copies. We wish to find a protocol to determine, with minimal error probability, which hypothesis better explains the nature of the N copies. No matter how complicated this protocol might be, it is clear that the output must be classical: we have to settle for one of the two hypotheses. Therefore the protocol develops in two stages. First, to obtain information about the states we must necessarily make a (quantum) measurement, which in contrast to the classical world is an inherently random and destructive process. Second, one has to provide a classical algorithm that processes the measurement outcomes (classical data) and produces the best answer ($H_0$ or $H_1$). Quantum mechanics allows for a convenient description of this two-step process by assigning to each answer, $H_0$ and $H_1$, a single POVM (positive operator valued measure) element $E_0$ and $E_1$, respectively ($E_b \ge 0$ acts on $\mathcal{H}^{\otimes N}$; $E_0 + E_1 = 1$). The probability that this POVM measurement gives the answer $H_b$ conditioned to $\rho = \rho_i$ is $p_i(b) = \mathrm{tr}(\rho_i^{\otimes N} E_b)$.

The problem thus reduces to finding the set of operators $\{E_b\}_{b=0}^{1}$ that minimize the mean probability of error. For the simplest case of a single copy (N = 1) and two equiprobable hypotheses ($\pi_0 = \pi_1 = 1/2$) it is [16]

$$
P_e = \frac{1}{2}\,[p_0(1) + p_1(0)] = \frac{1}{2}\,[\mathrm{tr}(\rho_0 E_1) + \mathrm{tr}(\rho_1 E_0)]. \tag{9}
$$

Since $E_0 = 1 - E_1$, we can introduce the Helstrom matrix $\Gamma = \rho_1 - \rho_0$, as is common in quantum state discrimination, and write

$$
P_e = \frac{1}{2}\,[1 - \mathrm{tr}(E_1 \Gamma)], \tag{10}
$$



which only needs to be optimized with respect to $E_1$. We note that $\Gamma$ has some negative eigenvalues, as $\mathrm{tr}\,\Gamma = 0$. This necessarily implies that the minimum error probability is attained if $E_1$ is the projector on the subspace of positive eigenvalues of $\Gamma$. We will denote this projector by $\{\Gamma > 0\}$ and define the positive part of $\Gamma$ as $\Gamma_+ = \{\Gamma > 0\}\Gamma$. Taking into account that $\Gamma$ is traceless, we obtain

$$
\mathrm{tr}(E_1\Gamma) = \mathrm{tr}\,\Gamma_+ = \frac{1}{2}\,\mathrm{tr}|\Gamma|, \tag{11}
$$



where the matrix $|A|$ (absolute value of A) is defined to be $|A| = \sqrt{A^\dagger A}$. We arrive at the final result [16],

$$
P_e = \frac{1}{2}\left(1 - \frac{1}{2}\,\mathrm{tr}|\rho_1 - \rho_0|\right). \tag{12}
$$
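A minimal numerical sketch of Eqs. (9)-(12) follows. It evaluates the minimum error from the spectrum of the Helstrom matrix and confirms that the projector onto its positive eigenspace, inserted into Eq. (9), gives the same value. The function names and the qubit example (|0> versus |+>) are assumptions made for illustration only.

```python
import numpy as np

def helstrom_error(rho0, rho1):
    """Eq. (12) for equal priors, using the eigenvalues of Gamma = rho1 - rho0."""
    gamma = np.asarray(rho1, dtype=complex) - np.asarray(rho0, dtype=complex)
    eigvals = np.linalg.eigvalsh(gamma)  # Gamma is Hermitian
    return 0.5 * (1.0 - 0.5 * np.abs(eigvals).sum())

def optimal_e1(rho0, rho1, tol=1e-12):
    """Projector {Gamma > 0} onto the positive eigenspace of Gamma."""
    gamma = np.asarray(rho1, dtype=complex) - np.asarray(rho0, dtype=complex)
    w, v = np.linalg.eigh(gamma)
    vp = v[:, w > tol]
    return vp @ vp.conj().T

# Hypothetical qubit example: |0><0| versus |+><+|.
rho0 = np.array([[1, 0], [0, 0]], dtype=complex)
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
rho1 = np.outer(plus, plus.conj())

e1 = optimal_e1(rho0, rho1)
e0 = np.eye(2) - e1
# Eq. (9) evaluated with the optimal POVM should reproduce Eq. (12).
p_err_povm = 0.5 * float(np.real(np.trace(rho0 @ e1) + np.trace(rho1 @ e0)))
print(helstrom_error(rho0, rho1), p_err_povm)
```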



The problem of discriminating multiple copies (arbitrary N) is thus formally solved by replacing $\rho_i$ by $\rho_i^{\otimes N}$ in the above equations. Indeed, if we do not have any restrictions on the type of measurements performed on the N copies, $E_1 = \{\rho_0^{\otimes N} - \rho_1^{\otimes N} < 0\}$, and the mean probability of error is just

$$
P_e = \frac{1}{2}\left(1 - \frac{1}{2}\,\mathrm{tr}\left|\rho_1^{\otimes N} - \rho_0^{\otimes N}\right|\right). \tag{13}
$$

However, the computation of the trace norm of the Helstrom matrix in (13) is tedious and, moreover, this equation provides little information about the large N behavior of the error probability, which is what the Chernoff bound is about.

The quantum version of the Chernoff (upper) bound was presented very recently in [3]. There it is shown that

$$
P_e \le \frac{1}{2}\min_{s\in[0,1]}\mathrm{tr}\,\rho_0^{s}\rho_1^{1-s} \equiv P_{QC} \tag{14}
$$



(the subscript QC stands for quantum Chernoff), which 
holds for arbitrary density matrices. Moreover, this 
bound can be very efficiently computed. 
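One way to compute the bound (14) numerically is sketched below: fractional powers are taken through an eigendecomposition, and the minimum over s is located on a uniform grid. This is an illustrative implementation under assumptions (function names, grid search, example states), not the procedure used by the authors.

```python
import numpy as np

def psd_power(rho, s, tol=1e-12):
    """rho**s on the support of rho (eigenvalues below tol are treated as zero)."""
    w, v = np.linalg.eigh(rho)
    ws = np.where(w > tol, np.clip(w, tol, None) ** s, 0.0)
    return (v * ws) @ v.conj().T

def q_s(rho0, rho1, s):
    """Q_s = tr(rho0^s rho1^(1-s))."""
    return float(np.real(np.trace(psd_power(rho0, s) @ psd_power(rho1, 1.0 - s))))

def quantum_chernoff_bound(rho0, rho1, grid=1001):
    """P_QC of Eq. (14), with the minimum over s taken on a uniform grid."""
    return 0.5 * min(q_s(rho0, rho1, s) for s in np.linspace(0.0, 1.0, grid))

def helstrom_error(rho0, rho1):
    """Exact single-copy error of Eq. (12), for comparison."""
    return 0.5 * (1.0 - 0.5 * np.abs(np.linalg.eigvalsh(rho1 - rho0)).sum())

# Hypothetical mixed qubit states.
rho0 = np.array([[0.9, 0.0], [0.0, 0.1]])
rho1 = np.array([[0.5, 0.3], [0.3, 0.5]])
print(helstrom_error(rho0, rho1), quantum_chernoff_bound(rho0, rho1))
```

For commuting (diagonal) states the same routine returns the classical bound of Eq. (1) evaluated on the two spectra.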

The bound (14) is a straightforward application of the following theorem [3]:

Theorem 1 Let A and B be two positive operators, then for all $0 \le s \le 1$,

$$
\mathrm{tr}\left(A^{s}B^{1-s}\right) \ge \frac{1}{2}\,\mathrm{tr}\left(A + B - |A - B|\right). \tag{15}
$$
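A numerical spot check is no substitute for a proof, but it can make the statement of Theorem 1 tangible. The sketch below draws random positive matrices and evaluates the difference between the two sides of Eq. (15) for several values of s. The sampling scheme, dimension, and function names are assumptions chosen for illustration.

```python
import numpy as np

def psd_power(a, s, tol=1e-12):
    """a**s on the support of a (eigenvalues below tol are treated as zero)."""
    w, v = np.linalg.eigh(a)
    ws = np.where(w > tol, np.clip(w, tol, None) ** s, 0.0)
    return (v * ws) @ v.conj().T

def theorem1_gap(a, b, s):
    """tr(A^s B^(1-s)) - 1/2 tr(A + B - |A - B|); Theorem 1 states this is non-negative."""
    lhs = np.real(np.trace(psd_power(a, s) @ psd_power(b, 1.0 - s)))
    trace_abs_diff = np.abs(np.linalg.eigvalsh(a - b)).sum()  # tr|A - B|
    rhs = 0.5 * (np.real(np.trace(a + b)) - trace_abs_diff)
    return float(lhs - rhs)

def random_positive(d, rng):
    """Random positive matrix G G^dagger with complex Gaussian G (not normalized)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return g @ g.conj().T

rng = np.random.default_rng(0)
gaps = [theorem1_gap(random_positive(4, rng), random_positive(4, rng), s)
        for s in np.linspace(0.0, 1.0, 11) for _ in range(20)]
print(min(gaps))
```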



The proof of this theorem involves advanced methods in matrix algebra and we refer the interested reader to [3]. Instead, here we will give a simple proof of the inequality (14) where, instead of minimizing over s, the particular value s = 1/2 will be chosen.

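We first notice that one obtains an upper bound to $P_e$ by picking any particular positive operator $E_1$ (and, accordingly, $E_0$) in (9). A convenient choice is $E_1 = \{\rho_0^{1/2} - \rho_1^{1/2} < 0\}$ (and thus $E_0 = \{\rho_0^{1/2} - \rho_1^{1/2} > 0\}$), where, as above, $\{A > 0\}$ stands for the projector onto the subspace spanned by the eigenstates of A with positive eigenvalue. After the following series of inequalities we arrive at the desired result [17]:

$$
\begin{aligned}
2P_e &\le \mathrm{tr}(E_1\rho_0) + \mathrm{tr}(E_0\rho_1)\\
&= \mathrm{tr}\left(\rho_0^{1/2}\rho_0^{1/2}\{\rho_0^{1/2}-\rho_1^{1/2} < 0\}\right) + \mathrm{tr}\left(\rho_1^{1/2}\rho_1^{1/2}\{\rho_0^{1/2}-\rho_1^{1/2} > 0\}\right)\\
&\le \mathrm{tr}\left(\rho_0^{1/2}\rho_1^{1/2}\{\rho_0^{1/2}-\rho_1^{1/2} < 0\}\right) + \mathrm{tr}\left(\rho_0^{1/2}\rho_1^{1/2}\{\rho_0^{1/2}-\rho_1^{1/2} > 0\}\right)\\
&= \mathrm{tr}\left[\rho_0^{1/2}\rho_1^{1/2}\left(\{\rho_0^{1/2}-\rho_1^{1/2} < 0\} + \{\rho_0^{1/2}-\rho_1^{1/2} > 0\}\right)\right]\\
&= \mathrm{tr}\left(\rho_0^{1/2}\rho_1^{1/2}\right),
\end{aligned} \tag{16}
$$

where in the second inequality we have used

$$
\begin{aligned}
\left(\rho_1^{1/2}-\rho_0^{1/2}\right)\{\rho_0^{1/2}-\rho_1^{1/2} < 0\} &\ge 0;\\
\left(\rho_0^{1/2}-\rho_1^{1/2}\right)\{\rho_0^{1/2}-\rho_1^{1/2} > 0\} &\ge 0.
\end{aligned} \tag{17}
$$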

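The general proof (for all s) follows the same steps but taking $E_1 = \{\rho_0^{1-s} - \rho_1^{1-s} < 0\}$ if $0 \le s \le 1/2$ and $E_1 = \{\rho_0^{s} - \rho_1^{s} < 0\}$ if $1/2 \le s \le 1$. In this case, the inequality analogous to the second one in (16) requires the two additional non-obvious relations

$$
\begin{aligned}
\mathrm{tr}\left[\rho_1^{1-s}\left(\rho_0^{s}-\rho_1^{s}\right)\{\rho_0^{1-s}-\rho_1^{1-s} > 0\}\right] &\ge 0; \qquad 0 \le s \le \tfrac{1}{2},\\
\mathrm{tr}\left[\rho_0^{s}\left(\rho_1^{1-s}-\rho_0^{1-s}\right)\{\rho_1^{s}-\rho_0^{s} > 0\}\right] &\ge 0; \qquad \tfrac{1}{2} \le s \le 1.
\end{aligned} \tag{18}
$$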
These inequalities follow immediately from the following non-trivial lemma, which constitutes the core of the proof [3]:

Lemma 1 Let A and B be two positive operators, then for all $0 \le t \le 1$,

$$
\mathrm{tr}\left[\{A - B > 0\}\,B\,(A^{t} - B^{t})\right] \ge 0. \tag{19}
$$



Before proceeding with the asymptotic limit, several comments about (14) are in order. (i) The exponential fall-off of the probability of error when a number N of copies is available follows immediately from $\mathrm{tr}(A \otimes B) = \mathrm{tr}A\,\mathrm{tr}B$:

$$
P_e \le \frac{1}{2}\exp\left[-N\left(-\min_{s\in[0,1]}\log\mathrm{tr}\,\rho_0^{s}\rho_1^{1-s}\right)\right]. \tag{20}
$$



Remarkably enough, this rate exponent, which we may call quantum Chernoff information because of its analogy with $C(p_0, p_1)$, is asymptotically attainable, as follows from the results of [4]. This is the quantum extension of the classical result (3) and was first conjectured by Ogawa and Hayashi in [2]. (ii) If the two matrices $\rho_0$ and $\rho_1$ commute the bound reduces to the classical Chernoff bound (1), where the two probability distributions are given by the spectrum of the two density matrices.

(iii) The function $Q_s = \mathrm{tr}\,\rho_0^{s}\rho_1^{1-s}$ (whose minimum gives the best bound) is a convex function of s in [0, 1], which means that a stationary point will automatically be the global minimum (see [?] for a proof). This is a very useful fact when computing the quantum Chernoff bound (14).

(iv) $Q_s$ is jointly concave in $(\rho_0, \rho_1)$, unitarily invariant, and non-decreasing under trace preserving quantum operations [?]. (v) The quantum Chernoff bound gives a tighter bound than that given by the quantum fidelity

$$
F(\rho_0,\rho_1) = \left(\mathrm{tr}\sqrt{\sqrt{\rho_0}\,\rho_1\sqrt{\rho_0}}\right)^{2} = \left(\mathrm{tr}\left|\sqrt{\rho_0}\sqrt{\rho_1}\right|\right)^{2}, \tag{21}
$$

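which is the most widely used quantum distinguishability measure (see next section). This follows from the following set of inequalities:

$$
P_e \le P_{QC} \le \frac{1}{2}\,\mathrm{tr}\,\rho_0^{1/2}\rho_1^{1/2} \le \frac{1}{2}\,\mathrm{tr}\left|\sqrt{\rho_0}\sqrt{\rho_1}\right| = \frac{1}{2}\sqrt{F(\rho_0,\rho_1)}. \tag{22}
$$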



(vi) The quantum Chernoff bound can be easily extended to the case where the two states $\rho_0$ and $\rho_1$ (sources) are not equiprobable:

$$
P_e \le \min_{s\in[0,1]}\pi_0^{s}\,\pi_1^{1-s}\,\mathrm{tr}\,\rho_0^{s}\rho_1^{1-s}. \tag{25}
$$



(vii) The permutation invariance of the N-copy density matrices, $\rho_i^{\otimes N}$, guarantees that the optimal collective measurement can be implemented efficiently (with a polynomial-size circuit known as quantum Schur transform) [21], and hence that the minimum probability of error is achievable with reasonable resources.

As stated above, for multiple-copy discrimination the error probability decreases exponentially with the number N of copies: $P_e \sim \exp[-N D(\rho_0, \rho_1)]$ as N goes to infinity [15]. The error (rate) exponent $D(\rho_0, \rho_1)$ is defined generically by

$$
D(\rho_0,\rho_1) = -\lim_{N\to\infty}\frac{1}{N}\log P_e \tag{26}
$$



and characterizes the asymptotic behavior of the error probability. From (20) we readily see that if the best (joint) measurement is used it coincides with the quantum Chernoff information,

$$
D_{QC}(\rho_0,\rho_1) = -\min_{s\in[0,1]}\log\mathrm{tr}\,\rho_0^{s}\rho_1^{1-s}, \tag{27}
$$



where the equality holds because of the attainability of (20) discussed above and we have added the subscript QC. Moreover, this asymptotic value is also attained by the square root (or "pretty-good") measurement (see [22, 23] for the precise definition). This immediately follows from the known bounds [?], $P_e \le P_e^{SRM} \le 2P_e$, where $P_e^{SRM}$ is the error probability of discrimination when the square root measurement is used.

Before closing this section, we briefly come back to the fidelity bounds in (22)-(24) and simply note that the first two inequalities translate into the following bounds to the rate exponent:

$$
-\frac{1}{2}\log F(\rho_0,\rho_1) \le D_{QC}(\rho_0,\rho_1) \le -\log F(\rho_0,\rho_1). \tag{28}
$$



If one of the states is pure, Eq. (24) implies that the factor 1/2 in (28) becomes 1 and we have the exact relation

$$
D_{QC}(\rho_0,\rho_1) = -\log F(\rho_0,\rho_1). \tag{29}
$$



In fact, the fidelity also provides a lower bound to the probability of error [?]:

$$
\frac{1}{2}\left(1 - \sqrt{1 - F(\rho_0,\rho_1)}\right) \le P_e. \tag{23}
$$
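The chain of bounds formed by Eqs. (22) and (23) can be checked numerically on random states. The sketch below assumes equal priors and a single copy; the random-state sampler, the grid minimization over s, and all function names are illustrative choices rather than anything prescribed by the text.

```python
import numpy as np

def psd_power(rho, s, tol=1e-12):
    """rho**s on the support of rho (eigenvalues below tol are treated as zero)."""
    w, v = np.linalg.eigh(rho)
    ws = np.where(w > tol, np.clip(w, tol, None) ** s, 0.0)
    return (v * ws) @ v.conj().T

def fidelity(rho0, rho1):
    """Eq. (21): squared sum of the singular values of sqrt(rho0) sqrt(rho1)."""
    sv = np.linalg.svd(psd_power(rho0, 0.5) @ psd_power(rho1, 0.5), compute_uv=False)
    return float(sv.sum() ** 2)

def helstrom_error(rho0, rho1):
    """Eq. (12): exact single-copy error for equal priors."""
    return 0.5 * (1.0 - 0.5 * np.abs(np.linalg.eigvalsh(rho1 - rho0)).sum())

def quantum_chernoff_bound(rho0, rho1, grid=201):
    """Eq. (14), minimized over s on a uniform grid."""
    return 0.5 * min(
        float(np.real(np.trace(psd_power(rho0, s) @ psd_power(rho1, 1.0 - s))))
        for s in np.linspace(0.0, 1.0, grid))

def random_state(d, rng):
    """Random full-rank density matrix G G^dagger / tr(G G^dagger)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.real(np.trace(rho))

def bound_chain(rho0, rho1):
    """Return (fidelity lower bound, P_e, P_QC, fidelity upper bound)."""
    f = fidelity(rho0, rho1)
    lower = 0.5 * (1.0 - np.sqrt(max(0.0, 1.0 - f)))
    return lower, helstrom_error(rho0, rho1), quantum_chernoff_bound(rho0, rho1), 0.5 * np.sqrt(f)

rng = np.random.default_rng(1)
chains = [bound_chain(random_state(3, rng), random_state(3, rng)) for _ in range(10)]
for chain in chains:
    print(chain)
```

Each printed tuple should be non-decreasing from left to right, up to rounding.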



In the case where one of the states (say $\rho_0$) is pure the upper bound to the error probability can be made tighter [?]:

$$
P_e \le P_{QC} \le \frac{1}{2}\,F(\rho_0,\rho_1). \tag{24}
$$



IV. DISTINGUISHABILITY MEASURES 

In this section we aim to define a measure of distinguishability between states using the results reviewed in Sec. III. Before doing so we will briefly outline how classical statistical methods can be used to (partially) accomplish this goal. We will then discuss an operational measure of distinguishability based on the error probability in multiple-copy state discrimination, leading to the quantum Chernoff measure. Finally we will define the analogous quantity for local discrimination protocols.
A. Classical statistical approach

The notion of distance between states is a fundamental issue that has been studied for a long time. A straightforward way to define such a distance is to take any suitable norm in the space of states. However, a more physical approach, kick-started by the pioneering work in [20], is to relate the inherently probabilistic nature of quantum measurements to classical statistical measures of distinguishability between probability distributions.

In particular, the author in [26] uses the notion of statistical distance,

$$
d_S(p_0,p_1) = \arccos\sqrt{\mathcal{F}(p_0,p_1)}, \tag{30}
$$

as a measure of distinguishability between the probability distributions $p_0$ and $p_1$, where

$$
\mathcal{F}(p_0,p_1) = \Big(\sum_b \sqrt{p_0(b)\,p_1(b)}\Big)^{2} \tag{31}
$$

is the statistical fidelity. Accordingly, he defines a distinguishability measure between quantum states $\rho_0$ and $\rho_1$ by maximizing $d_S(p_0,p_1)$ [i.e., minimizing $\mathcal{F}(p_0,p_1)$] over all possible POVM measurements, characterized by all possible sets of operators $\{E_b\}$ with outcome probabilities given by $p_0(b) = \mathrm{tr}(E_b\rho_0)$ and $p_1(b) = \mathrm{tr}(E_b\rho_1)$. The statistical distance as such makes sense only when the number of samplings of the probability distribution is large. Hence, in the quantum extension of this notion it is implicitly assumed that one performs the same measurement on each of a large number N of copies of the state $\rho \in \{\rho_0, \rho_1\}$. The optimization over such local repeated measurements leads to one of the most widely used distinguishability measures [27]: the (quantum) fidelity $F(\rho_0,\rho_1)$, defined in (21).

The fidelity, or statistical distance, has many desirable properties: (i) it is easily computable; (ii) for pure states it reduces to the standard distance given by the angle between rays in the Hilbert space $\mathcal{H}$; (iii) as mentioned above, it provides bounds to $P_e$. Nevertheless, a strict physical interpretation is so far unclear, and its definition is based on repeated local measurements, while quantum mechanics allows for much more general ways to access the information contained in the N copies, via collective measurements on the whole of them.



B. Quantum Chernoff distance 

A very natural and also operational distinguishability measure is provided by the error probability of discrimination. As a first candidate, one could take this very error probability $P_e$ for a given fixed number N of copies. However, the choice of a particular N in such a definition would not only be arbitrary but also problematic since one can find examples [?] where $P_e(\rho_0,\rho_1;N) > P_e(\rho_0',\rho_1';N)$, whereas $P_e(\rho_0,\rho_1;M) < P_e(\rho_0',\rho_1';M)$ for a different number M of copies. A straightforward way to get around this problem is to use the asymptotic expressions for $N \to \infty$ and define the distinguishability measure as the largest rate exponent in (26). We further note that the presence of the logarithm ensures that $D(\rho_0,\rho_1) = 0$ if and only if $\rho_0 = \rho_1$, while the minus sign makes distinguishability decrease as discrimination becomes more difficult, i.e., as $P_e$ increases.

The quantum Chernoff information, $D_{QC}(\rho_0,\rho_1)$, is therefore a physically meaningful and efficiently computable distinguishability measure. Note that (27) does not stricto sensu define a distance, since it does not fulfil the triangle inequality. It has however all of the other properties that one should expect from a reasonable measure. This, in itself, is already a remarkable fact since, as far as measures and metrics are concerned, there is usually a compromise among operational definiteness, computability and contractivity [28]. For instance, the distance proposed in [29], although having an operational definition, is not contractive.

We point out that another operational distinguishability measure can be obtained in asymmetric hypothesis testing by minimizing the type II error rate while keeping the type I error rate upper-bounded by a fixed value. The optimal error rate in this situation is provided by the quantum Stein's Lemma [10, 11] and leads to the well known quantum relative entropy. Despite having an operational meaning, the quantum relative entropy has two obvious drawbacks as a distinguishability measure: it is not symmetric in its arguments and it diverges if one of the states is pure.



C. Classical Chernoff distance: local measurements 

In the derivation of the quantum Chernoff bound one optimizes over all possible quantum measurements, in particular over quantum joint measurements on $\mathcal{H}^{\otimes N}$ that act over all the N copies coherently. It is of great interest, both theoretically and in practice, to know whether such joint measurements are strictly necessary to attain the bound or whether one can make do with separable ones (which include those that can be implemented with local operations and classical communication, simply known as LOCC measurements). As far as we are aware, the answer to this is unknown. This question is also relevant in connection with the operational meaning attached to $D(\rho_0,\rho_1)$. In this section we focus on this operational aspect and compute $D(\rho_0,\rho_1)$ from its definition in (26) assuming that the discrimination protocol that $P_e$ refers to is constrained to make use of the same individual measurements, defined by a local POVM $\{E_b\}$, on each of the N available copies. We loosely refer to these protocols as local. Local protocols are relevant from the theoretical point of view since they help to elucidate the role of quantum correlated measurements in asymptotic hypothesis testing. For example, in quantum phase estimation local measurements suffice to achieve the collective bounds. Here, we will show that these protocols do not achieve the quantum Chernoff bound. In addition, from a more practical point of view, local protocols are much simpler to implement experimentally, especially in a situation where the number of subsystems is increasingly large.

In such a local protocol, after the measurements have been performed we have a sample of N elements of the probability distribution $p_i(b) = \mathrm{tr}(E_b\rho_i)$, $i = 0, 1$, based on which we have to discriminate between the candidates $H_0$ and $H_1$. In such a scenario the error probability, which we call $P_e^{loc}$, can be obtained using the classical Chernoff bound (1) applied to the distributions $p_0$ and $p_1$. One can thus define the error exponent (26) and thereby introduce a new operational distinguishability measure based on local discrimination:

$$
D_{CC}(\rho_0,\rho_1) = -\min_{\{E_b\}}\min_{s\in[0,1]}\log\sum_{b} p_0^{s}(b)\,p_1^{1-s}(b), \tag{32}
$$

where the subscript CC reminds us that we have made use of the classical Chernoff bound.

The measure $D_{CC}(\rho_0,\rho_1)$ is obtained by maximizing the rate exponent over all possible single-copy generalized measurements $\{E_b\}$ (just as is done for the fidelity). Unfortunately, there is no simple closed expression for this maximum for general mixed states. However, we do encounter again the relation (22) with the fidelity: since the square root of the statistical fidelity $\mathcal{F}(p_0,p_1)$ upper bounds $P_{CC}$ in (1), it also upper bounds the local error probability $P_e^{loc}$. That is,

$$
P_e^{loc} \le P_{CC} \le \frac{1}{2}\min_{\{E_b\}}\sqrt{\mathcal{F}(p_0,p_1)} \tag{33}
$$

and

$$
D_{CC}(\rho_0,\rho_1) \ge -\frac{1}{2}\log F(\rho_0,\rho_1). \tag{34}
$$



Since $D_{QC}(\rho_0,\rho_1) \ge D_{CC}(\rho_0,\rho_1)$, we note that whenever $D_{QC}(\rho_0,\rho_1) = -(1/2)\log F(\rho_0,\rho_1)$ the inequality (34) has to be saturated. This, in turn, means that in this situation one can optimally discriminate between $H_0$ and $H_1$ just by performing a fixed local measurement on each of the N copies (no collective measurements are required to attain the quantum Chernoff bound).

There is still another important situation when the quantum Chernoff bound is attainable by local measurements: when one of the states (say $\rho_0$) is pure. If this is the case, Eq. (24) holds and $D_{QC}(\rho_0,\rho_1) = -\log F(\rho_0,\rho_1)$. To prove that $D_{CC}(\rho_0,\rho_1) = D_{QC}(\rho_0,\rho_1)$, let us consider the two-outcome measurement defined by $E_0 = \rho_0$, $E_1 = 1 - \rho_0$. Note that $p_0(1) = \mathrm{tr}(E_1\rho_0) = 0$ and $p_0(0) = \mathrm{tr}(E_0\rho_0) = 1$. After performing this measurement on each of the N copies the protocol proceeds as follows: we accept $H_0$ if all of the outcomes are 0, otherwise we accept $H_1$. One may refer to this classical data processing as unanimity vote [30]. The error probability can be easily computed by noticing that no error occurs unless we get N times the outcome 0 [since $p_0(1) = 0$]. Therefore,

$$P_e^{\rm loc} = \pi_1\,p_1^{N}(0) = \pi_1[\mathrm{tr}(\rho_0\rho_1)]^{N} = \pi_1[F(\rho_0,\rho_1)]^{N}, \qquad (35)$$

where the last equality holds because $\rho_0$ is assumed to be a pure state. From this equation it follows immediately that $D_{CC}(\rho_0,\rho_1) = -\log F(\rho_0,\rho_1) = D_{QC}(\rho_0,\rho_1)$, and the quantum Chernoff bound is attainable by local measurements. It also follows from the first equality in (35) that this result corresponds to taking the limit $s \to 0$ in the classical Chernoff minimization over s.
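As an illustrative numerical sketch of the argument above (not part of the original derivation), one can scan the classical Chernoff sum for the measurement $E_0=\rho_0$, $E_1=1-\rho_0$ and compare the resulting exponent with $-\log \mathrm{tr}(\rho_0\rho_1)$. The example states, the grid over s and the helper names `outcome_probs` and `classical_chernoff_exponent` are assumptions introduced here for illustration only.

```python
import numpy as np

def outcome_probs(rho, povm):
    """Outcome probabilities p(b) = tr(E_b rho) of a single-copy measurement."""
    return np.array([np.real(np.trace(E @ rho)) for E in povm])

def classical_chernoff_exponent(p0, p1, num_s=2001):
    """-min_s log sum_b p0(b)^s p1(b)^(1-s), scanned on a grid strictly inside (0, 1)."""
    s = np.linspace(1e-6, 1.0 - 1e-6, num_s)[:, None]
    sums = np.sum(p0[None, :] ** s * p1[None, :] ** (1.0 - s), axis=1)
    return -np.log(sums.min())

# Hypothetical example states (assumption): rho0 is pure, rho1 is mixed.
rho0 = np.array([[1.0, 0.0], [0.0, 0.0]])
rho1 = np.array([[0.7, 0.2], [0.2, 0.3]])

povm = [rho0, np.eye(2) - rho0]      # E_0 = rho0, E_1 = 1 - rho0
p0 = outcome_probs(rho0, povm)       # (1, 0): outcome 1 never occurs under H0
p1 = outcome_probs(rho1, povm)

d_local = classical_chernoff_exponent(p0, p1)
d_target = -np.log(np.real(np.trace(rho0 @ rho1)))   # -log F for pure rho0
print(d_local, d_target)
```

The minimum of the sum is reached at the smallest s on the grid, which mirrors the $s \to 0$ limit discussed in the text.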






V. METRIC 

The set of states of a quantum system, as that of classical probability distributions on a given sample space,^4 can be endowed with a metric structure [31], and thus thought of as a Riemannian manifold. This enables us to relate geometrical concepts (e.g., distance, volume, curvature, parallel transport) to physical ones (e.g., state discrimination and estimation, geometrical phases). Among the novel applications of metrics in quantum information, they have been recently used to characterize quantum phase transitions [32].

The first step towards this geometric approach to quantum states is to define the line element ds or (infinitesimal) distance between two neighboring "points" $\rho$ and $\rho - d\rho$. All local properties follow from this definition. More precisely, they follow from the metric, i.e., from the set of coefficients of $ds^2$ when written as a quadratic form in the differentials of the coordinates (parameters) that specify the quantum states. There is, however, no unique choice of ds unless some monotonicity conditions are invoked.

For classical probability distributions, $\{p(b)\}$, a line element is singled out (up to a proportionality factor) by imposing that it be non-increasing under stochastic maps. It is the well known Fisher metric (in what follows the terms metric and line element will be used interchangeably):

$$ds_F^2 = \frac{1}{4}\sum_b \frac{[dp(b)]^2}{p(b)}. \qquad (36)$$

In contrast to the classical case, the monotonicity condition under completely positive (quantum stochastic) maps does not define a metric uniquely, which explains why a substantial body of research on quantum metrics has emerged over the last years. Among the main developments, Petz [33] has characterized the family of quantum contractive metrics by establishing a correspondence with operator-monotone functions.

4 For the sake of clarity, in this section we assume a finite sample space, but the results hold also for general probability measures over continuous spaces.

An alternative, more physical approach is to define a line element from a suitable distinguishability measure between infinitesimally close states. A remarkable example is given in [34]. In this seminal paper Braunstein and Caves consider a one-parameter family of states $\rho(\theta)$ and map the problem of distinguishability to that of estimating the parameter $\theta$ optimally. They define a line element, $ds_{BC}^2$, as $d\theta^2$ expressed in the appropriate units of statistical deviation (roughly speaking, $d\theta^2$ divided by the minimal error in the estimation of $\theta$). By making use of classical statistical methods (Cramer-Rao bound) they find

$$ds_{BC}^2 = 4\max_{\{E_b\}} ds_F^2 = \max_{\{E_b\}} I_F\,d\theta^2, \qquad (37)$$

where $I_F = \sum_b [dp(b)/d\theta]^2/p(b)$ (it is the so-called Fisher information), with $p(b) = \mathrm{tr}[E_b\rho(\theta)]$, and the maximization is over all possible POVM measurements $\{E_b\}$ on a single copy of $\rho(\theta)$. They also succeed in giving a closed expression for $ds_{BC}^2$ and show that their metric coincides up to a factor with that induced by the Bures-Uhlmann distance [35, 36]



$$d_{BU}(\rho_0,\rho_1) = \sqrt{2}\left[1 - \sqrt{F(\rho_0,\rho_1)}\right]^{1/2}. \qquad (38)$$

More precisely, they show that $ds_{BC}^2 = 4\,ds_{BU}^2$, where

$$ds_{BU}^2 = [d_{BU}(\rho,\rho - d\rho)]^2 \qquad (39)$$

[see also (69) below] and a series expansion to $O(d\rho^2)$ is understood in the right hand side of this equation. We note in passing that for commuting states, i.e., classical probability distributions, the Bures-Uhlmann line element $ds_{BU}^2$ coincides with the Fisher metric (36). A quantum metric with such normalization is said to be Fisher adjusted.

Although one can obtain a finite distance $d_{BC}(\rho_0,\rho_1)$ for arbitrary states $\rho_0$ and $\rho_1$ by integrating $ds_{BC}$ along geodesics, it is important to notice that the operational meaning of the Braunstein and Caves metric is lost in the process.

In the spirit of Braunstein and Caves' physical approach to metrics, we next consider the distinguishability measures $D_{QC}$ and $D_{CC}$, discussed in Section IV, for infinitesimally close states and derive line elements with the same operational meaning, which we call $ds_{QC}$ and $ds_{CC}$ respectively. For $ds_{QC}$ we also give the volume element and the prior probability distribution, whereas those corresponding to the metric $ds_{CC}$ can be easily found in the literature since, as will be shown, $ds_{CC}^2$ is proportional to the widely-studied Bures metric $ds_{BU}^2$.

Before we start we would like to point out that one could also consider line elements induced by other quantities, such as the quantum relative entropy, which, as we saw above, also has a clear operational interpretation. The quantum relative entropy induces the so-called Kubo-Mori metric [37], which has the drawback of being singular for pure states.



A. Quantum Chernoff metric 

For neighboring density matrices $\rho$ and $\rho - d\rho$ (e.g., those for which their independent matrix elements differ by an infinitesimal amount) the distinguishability measure $D(\rho,\rho - d\rho)$ defines a metric, as in (39). For the quantum Chernoff measure, $D_{QC}$, this metric can be computed from Eq. (27) [38]:

$$ds_{QC}^2 = 1 - \min_s \mathrm{tr}\left[\rho^s(\rho - d\rho)^{1-s}\right] + \dots, \qquad (40)$$

where the dots stand for higher order terms in $d\rho$ that will not contribute to $ds^2$ and we have also used that $\log y = y - 1 + \dots$. We now recall the integral representation

$$a^{t} = \frac{\sin(t\pi)}{\pi}\int_0^\infty dx\,\frac{a\,x^{t-1}}{a+x}, \qquad 0 < t < 1, \qquad (41)$$

and its derivative,

$$t\,a^{t-1} = \frac{\sin(t\pi)}{\pi}\int_0^\infty dx\,\frac{x^{t}}{(a+x)^2}, \qquad -1 < t < 1. \qquad (42)$$



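These representations hold for $a > 0$ and can be straightforwardly extended to positive matrices. In particular, using (41) and the convergent series

$$\frac{1}{a-b} = a^{-1} + a^{-1}ba^{-1} + a^{-1}ba^{-1}ba^{-1} + \dots, \qquad (43)$$

which also holds for matrices provided $a > b$, one can write, up to second order in $d\rho$,

$$(\rho - d\rho)^{1-s} = c_s\int_0^\infty dx\,x^{-s}(\rho - d\rho)\frac{1}{\rho - d\rho + x} \simeq c_s\int_0^\infty dx\,x^{-s}(\rho - d\rho)\left[\frac{1}{\rho+x} + \frac{1}{\rho+x}d\rho\frac{1}{\rho+x} + \frac{1}{\rho+x}d\rho\frac{1}{\rho+x}d\rho\frac{1}{\rho+x}\right], \qquad (44)$$

where $c_s = \pi^{-1}\sin(s\pi)$. Inserting this expansion in (40) one finds

$$ds_{QC}^2 = \max_s\,c_s\int_0^\infty dx\,\mathrm{tr}\left[\frac{x^{1-s}}{(\rho+x)^2}\rho^s\,d\rho + \frac{x^{1-s}}{(\rho+x)^2}\rho^s\,d\rho\,\frac{1}{\rho+x}\,d\rho\right]. \qquad (45)$$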



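The first term in the integrand vanishes, as can be seen by using (42) and $\mathrm{tr}\,d\rho = 0$, while the second term can be computed in the eigenbasis $\{|i\rangle\}$ of $\rho$, $\rho = \sum_i \lambda_i|i\rangle\langle i|$:

$$ds_{QC}^2 = \max_{s\in(0,1)}\sum_{ij}|\langle i|d\rho|j\rangle|^2\,c_s\int_0^\infty dx\,\frac{x^{1-s}\lambda_i^s}{(\lambda_i+x)^2(\lambda_j+x)} = \frac{1}{2}\max_{s\in(0,1)}\sum_{ij}\frac{|\langle i|d\rho|j\rangle|^2}{(\lambda_i-\lambda_j)^2}\left(\lambda_i + \lambda_j - \lambda_i^s\lambda_j^{1-s} - \lambda_j^s\lambda_i^{1-s}\right), \qquad (46)$$

where in the second equality we have taken into account that $d\rho = d\rho^\dagger$, which enabled us to symmetrize the expression in parentheses that multiplies $|\langle i|d\rho|j\rangle|^2$ in the sum (this symmetrization gives the factor 1/2). Each term of the sum is maximized at $s = 1/2$, where the parenthesis becomes $(\sqrt{\lambda_i}-\sqrt{\lambda_j})^2$, so the quantum Chernoff metric can be finally written as

$$ds_{QC}^2 = \frac{1}{2}\sum_{ij}\frac{|\langle i|d\rho|j\rangle|^2}{(\sqrt{\lambda_i}+\sqrt{\lambda_j})^2}. \qquad (47)$$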



The quantum Chernoff metric belongs to the family of contractive quantum metrics, as it should, since by construction the probability of error cannot be improved by a pre-processing of the states. In fact the quantum Chernoff metric coincides with a member of this family that has been explicitly written by Petz in [39] and with the so-called Wigner-Yanase metric, which has been recently studied in depth by the authors of [40]. In particular, the geodesic distance, the geodesic path, and the scalar curvature of the quantum Chernoff metric can be read off from their Eqs. (5.1-5.3).
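The quadratic form (47) can be checked numerically against the defining expression $-\min_s \log \mathrm{tr}[\rho^s(\rho-d\rho)^{1-s}]$ for a small perturbation. The following is a minimal sketch under stated assumptions: the random test state, the perturbation size, the grid over s and the helper names are all introduced here for illustration and are not taken from the paper.

```python
import numpy as np

def mat_power(rho, s):
    """Fractional power of a positive semidefinite matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(rho)
    w = np.clip(w, 0.0, None)
    return (v * w ** s) @ v.conj().T

def chernoff_measure(rho, sigma, num_s=401):
    """-min_s log tr(rho^s sigma^(1-s)), with s scanned on a uniform grid in [0, 1]."""
    s_grid = np.linspace(0.0, 1.0, num_s)
    q = [np.real(np.trace(mat_power(rho, s) @ mat_power(sigma, 1.0 - s))) for s in s_grid]
    return -np.log(min(q))

def chernoff_line_element(rho, drho):
    """Quadratic form of Eq. (47), evaluated in the eigenbasis of rho."""
    lam, v = np.linalg.eigh(rho)
    d = v.conj().T @ drho @ v
    root = np.sqrt(lam)
    return 0.5 * np.sum(np.abs(d) ** 2 / (root[:, None] + root[None, :]) ** 2)

# Hypothetical full-rank qutrit state and a small traceless Hermitian perturbation.
rng = np.random.default_rng(0)
u, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
rho = (u * np.array([0.5, 0.3, 0.2])) @ u.conj().T
h = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
h = 0.5 * (h + h.conj().T)
drho = 1e-4 * (h - np.trace(h) / 3 * np.eye(3))

exact = chernoff_measure(rho, rho - drho)
quadratic = chernoff_line_element(rho, drho)
ratio = exact / quadratic
print(ratio)   # expected to be close to 1 for small perturbations
```

The ratio deviates from one only through terms of higher order in $d\rho$, so shrinking the perturbation should bring it closer to one until rounding errors take over.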

By separating diagonal from off-diagonal terms, the metric in (47) can also be written as

$$ds_{QC}^2 = \sum_i \frac{(d\lambda_i)^2}{8\lambda_i} + \sum_{i<j}\frac{|\langle i|d\rho|j\rangle|^2}{(\sqrt{\lambda_i}+\sqrt{\lambda_j})^2}. \qquad (48)$$



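Next, we wish to identify the degrees of freedom in the off-diagonal terms. We will see that they correspond to infinitesimal unitary transformations acting on $\rho$ (which leave its eigenvalues unchanged). This is most conveniently done by parameterizing $\rho$ by its eigenvalues and eigenvectors, namely by $\lambda_i$ and the components of $|i\rangle$ onto a given canonical basis $\{|\alpha_k\rangle\}$:

$$U_{ki} = \langle\alpha_k|i\rangle = \langle\alpha_k|U|\alpha_i\rangle \qquad (49)$$

(naturally, it also holds that $U_{ki} = \langle k|U|i\rangle$). A neighboring density matrix $\rho' = \sum_i \lambda'_i|i'\rangle\langle i'|$ is thus parameterized by $\lambda'_i = \lambda_i + d\lambda_i$ and $U'_{ki} = U_{ki} + dU_{ki} = \langle\alpha_k|i'\rangle$. We further note that $|i'\rangle = (1 + \delta T)|i\rangle$, where $\delta T$ is antihermitian, $\delta T^\dagger = -\delta T$. It is actually the infinitesimal generator along the direction in parameter space that takes $\{|i\rangle\}$ into $\{|i'\rangle\}$. It follows that $dU_{ki} = \langle\alpha_k|\delta T|i\rangle$. The matrix elements of $d\rho$ can be expressed as

$$\langle i|d\rho|j\rangle = \langle i|(\rho' - \rho)|j\rangle = \sum_k \lambda'_k\langle i|k'\rangle\langle k'|j\rangle - \lambda_i\delta_{ij} = d\lambda_i\,\delta_{ij} + (\lambda_j - \lambda_i)\langle i|\delta T|j\rangle + O(\delta T^2), \qquad (50)$$

and those of $\delta T$ as

$$\langle i|\delta T|j\rangle = \sum_k \langle i|\alpha_k\rangle\langle\alpha_k|\delta T|j\rangle = \sum_k U^*_{ki}\,dU_{kj} = \sum_k \langle\alpha_i|U^\dagger|\alpha_k\rangle\langle\alpha_k|dU|\alpha_j\rangle = \langle\alpha_i|U^\dagger dU|\alpha_j\rangle = (U^\dagger dU)_{ij}, \qquad (51)$$

where we have used (49) in going from the first to the second line [the very same matrix elements of $\delta T$ can also be written as $(dU\,U^\dagger)_{ij}$ in the eigenbasis of $\rho$]. Substituting these relations back into (48) we obtain

$$ds_{QC}^2 = \sum_i \frac{(d\lambda_i)^2}{8\lambda_i} + \sum_{i<j}\left(\sqrt{\lambda_i}-\sqrt{\lambda_j}\right)^2\left|(U^\dagger dU)_{ij}\right|^2. \qquad (52)$$

The same expression can also be derived by differentiating

$$\rho = U\rho^{(0)}U^\dagger, \qquad (53)$$

where $\rho^{(0)} = \sum_i \lambda_i|\alpha_i\rangle\langle\alpha_i|$ is diagonal in the canonical basis and has the spectrum of $\rho$.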

Eq. (52) displays the metric $ds_{QC}^2$ in a very suggestive form. Any density matrix can be parameterized by its eigenvalues $\{\lambda_i\}$ and the unitary matrix U that diagonalizes it. Eq. (52) expresses the infinitesimal distance between two such matrices in terms of these very parameters. The first term is immediately recognized as the (Fisher) metric on the (d-1)-dimensional simplex of eigenvalues of $\rho$, which is assumed to be $d \times d$ throughout the rest of this section (note that $\sum_i \lambda_i = 1$, which implies $\sum_i d\lambda_i = 0$). Thus, stricto sensu, it should be expressed in terms of a set of $d-1$ independent eigenvalues. If we choose this set to be $\{\lambda_i\}_{i=1}^{d-1}$ the first term in (52) becomes

$$\frac{1}{8}\sum_{i,j=1}^{d-1} g_F^{ij}\,d\lambda_i\,d\lambda_j, \qquad (54)$$

where the subscript F stands for Fisher, and

$$g_F^{ij} = \frac{\delta_{ij}}{\lambda_i} + \frac{1}{\lambda_d}, \qquad \lambda_d = 1 - \sum_{k=1}^{d-1}\lambda_k, \qquad \text{for } 1 \le i,j \le d-1. \qquad (55)$$

It follows that the determinant of $g_F$, which we will need below, is

$$\det g_F = (\lambda_1\cdots\lambda_{d-1}\lambda_d)^{-1}. \qquad (56)$$



The second term in (52) contains the factors $|(U^\dagger dU)_{ij}|^2$, which are invariant under left-multiplication [since the left-hand side of (51) is independent of the choice of basis $\{|\alpha_k\rangle\}$]. Hence, the normalized volume element induced by these terms will coincide with the (unique) Haar measure $dV_H$ of $U(d)/[U(1)]^d$, known as the flag manifold $Fl_{\mathbb{C}}^{(d)}$ (see e.g., [41] and references therein). Using the wedge product of differential forms, this Haar measure can be written as

$$dV_H = C_H\bigwedge_{i<j}\mathrm{Re}(U^\dagger dU)_{ij}\wedge\mathrm{Im}(U^\dagger dU)_{ij}, \qquad (57)$$

where $C_H$ is a normalization constant so that $\int dV_H = 1$. Note that the one-form basis in (57) contains $2\times[d(d-1)/2]$ (real and independent) elements, which indeed coincides with the $d^2 - d$ independent parameters of $U(d)/[U(1)]^d$.

Volume elements (derived from metrics) are of great interest because they give a canonical way of defining prior probability distributions on continuous sets. According to this approach, Eqs. (52) and (57) provide a means to define such a probability distribution for general density matrices: if $\theta = (\theta_1,\theta_2,\dots)$ is a set of independent real parameters that specifies the density matrices as $\rho(\theta)$ and the metric is written as $ds^2 = d\theta\,g\,d\theta^t$ (i.e., g is the metric tensor), then we can define the prior $\mathcal{P}[\rho(\theta)]$ through the relation $\mathcal{P}[\rho(\theta)]\prod_a d\theta_a = dV/\int dV$, where $dV = \sqrt{\det g}\,\prod_a d\theta_a$. It follows from (52) that $\mathcal{P}[\rho(\theta)]$ is the product of two independent probability distributions: one that depends exclusively on the parameters encoded in the unitary matrix U and expresses the fact that they are simply distributed according to the Haar measure $dV_H$; and one, denoted as $\mathcal{P}(\{\lambda_i\})$, that gives the probability distribution of eigenvalues. The latter can be written as

$$\mathcal{P}(\{\lambda_i\}) = \frac{1}{C_d}\,\delta\Big(1 - \sum_i\lambda_i\Big)\prod_i\frac{1}{\sqrt{\lambda_i}}\prod_{i<j}\left(\sqrt{\lambda_i}-\sqrt{\lambda_j}\right)^2, \qquad (58)$$

where for a given dimension d the constant $C_d$ is chosen to ensure that probability adds up to one.

The prior distribution on the simplex of eigenvalues of $\rho$ for the Bures metric (see below), analogous to $\mathcal{P}(\{\lambda_i\})$ in (58), was proposed in [12], but it took considerable efforts to compute the right normalization constant. Slater [43] gave values for dimensions d = 3, 4, 5 and finally Sommers and Zyczkowski [44] managed to give a general expression for arbitrary finite dimensions. Here we will compute $C_d$ following similar techniques.

The coefficient $C_d$ is defined by the normalization condition $\int\mathcal{P}(\{\lambda_i\})\prod_i d\lambda_i = 1$. Thus, $C_d = I(1)$, where

$$I(r) = \int_0^\infty\prod_i\frac{d\lambda_i}{\sqrt{\lambda_i}}\;\delta\Big(r^2 - \sum_i\lambda_i\Big)\prod_{i<j}\left(\sqrt{\lambda_i}-\sqrt{\lambda_j}\right)^2. \qquad (59)$$



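Although we only need this integral for r = 1, the introduction of this radial parameter r enables us to compute the normalization I(1) more easily. We first note that by re-scaling $\lambda_i \to r^2\lambda_i$ one gets

$$I(r) = r^{d^2-2}\,I(1) \qquad (60)$$

[i.e., I(r) is a homogeneous function of r of degree $d^2 - 2$], and thus

$$\int_0^\infty dr\,r\,e^{-r^2}I(r) = I(1)\int_0^\infty dr\,r^{d^2-1}e^{-r^2}. \qquad (61)$$

It follows from this equation that

$$C_d = I(1) = \frac{2^d}{\Gamma(d^2/2)}\int_0^\infty\prod_i\frac{d\lambda_i\,e^{-\lambda_i}}{2\sqrt{\lambda_i}}\prod_{i<j}\left(\sqrt{\lambda_i}-\sqrt{\lambda_j}\right)^2. \qquad (62)$$

This expression can be further simplified by the change of variables $\lambda_i \to t_i = \sqrt{\lambda_i}$, which leads to

$$C_d = \frac{2^d}{\Gamma(d^2/2)}\int_0^\infty\prod_i dt_i\,e^{-t_i^2}\prod_{i<j}(t_i - t_j)^2. \qquad (63)$$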



By expanding the square of the Vandermonde determinant $\prod_{i<j}(t_i - t_j)$ one could in principle compute $C_d$ in terms of Euler gamma functions. However, this is very impractical since the number of terms in such an expansion grows exponentially with d. A much more efficient way to proceed is as follows. Let $\{P_k(t) = a_kt^k + a_{k-1}t^{k-1} + \dots + a_1t + a_0\}$, $a_k \neq 0$, be a family of orthonormal polynomials on $[0,\infty)$ with a weight function of Hermite type, so that

$$\int_0^\infty dt\,e^{-t^2}P_k(t)P_l(t) = \delta_{kl}. \qquad (64)$$

Note that $\{P_k(t)\}$ are not Hermite polynomials, since the integration range is $[0,\infty)$ instead of $(-\infty,\infty)$. Now, if we define the renormalized polynomials $Q_k(t) = P_k(t)/a_k$ it is not hard to show that

$$\prod_{i<j}(t_i - t_j) = \begin{vmatrix} Q_{d-1}(t_1) & Q_{d-2}(t_1) & \cdots & Q_0(t_1) \\ Q_{d-1}(t_2) & Q_{d-2}(t_2) & \cdots & Q_0(t_2) \\ \vdots & \vdots & & \vdots \\ Q_{d-1}(t_d) & Q_{d-2}(t_d) & \cdots & Q_0(t_d) \end{vmatrix}. \qquad (65)$$

Substituting into (63) and using the orthonormality of $P_k$, one has

$$C_d = \frac{2^d\,d!}{\Gamma(d^2/2)}\prod_{k=0}^{d-1}a_k^{-2}. \qquad (66)$$



In contrast to the examples considered in Ref. [44], and as far as we are aware, there is no known closed expression for the leading coefficients $a_k$ for the case at hand. However, Eq. (66) provides an efficient way of computing the quantum Chernoff normalization constant $C_d$; e.g., by applying the Gram-Schmidt orthogonalization algorithm [with the inner product defined in Eq. (64)] one easily obtains the coefficients $a_k$, and thereby $C_d$. We give the value of this constant for $d \le 6$:

$$C_2 = \pi - 2;$$
$$C_3 = \frac{8}{35}(\pi - 3);$$
$$C_4 = \frac{6\pi^2 - 29\pi + 32}{6720};$$
$$C_5 = \frac{128\,(72\pi^2 - 435\pi + 656)}{21082276215};$$
$$C_6 = \frac{9\,(480\pi^3 - 3747\pi^2 + 9352\pi) - 65536}{2023466257612800}. \qquad (67)$$
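The recipe of Eq. (66) can be sketched numerically as follows. Instead of running Gram-Schmidt explicitly, this sketch uses the standard fact that the product of the squared norms of the monic orthogonal polynomials, $\prod_k a_k^{-2}$, equals the determinant of the Hankel matrix of moments $m_k = \int_0^\infty t^k e^{-t^2}dt = \Gamma((k+1)/2)/2$; this shortcut and the function names are assumptions introduced here, not the authors' implementation. In double precision it is only meant for small d, since the constants involve strong cancellations.

```python
import math
import numpy as np

def half_line_moment(k):
    """m_k = int_0^inf t^k exp(-t^2) dt = Gamma((k+1)/2) / 2."""
    return 0.5 * math.gamma(0.5 * (k + 1))

def chernoff_normalization(d):
    """C_d from Eq. (66), with prod_k a_k^(-2) evaluated as a Hankel (Gram) determinant."""
    hankel = np.array([[half_line_moment(i + j) for j in range(d)] for i in range(d)])
    gram = np.linalg.det(hankel)
    return 2 ** d * math.factorial(d) / math.gamma(0.5 * d * d) * gram

c2 = chernoff_normalization(2)
c3 = chernoff_normalization(3)
c4 = chernoff_normalization(4)
print(c2, c3, c4)
```

For larger d an exact or extended-precision evaluation of the determinant would be the safer choice.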



B. Classical Chernoff/Bures metric 



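From the local measure $D_{CC}(\rho_0,\rho_1)$, Eq. (32), one can readily obtain the corresponding local metric. If $0 < p(b) = \mathrm{tr}(\rho E_b) < 1$ for every measurement outcome b, direct differentiation of $D_{CC}(\rho,\rho - d\rho)$ leads to

$$ds_{CC}^2 = \frac{1}{2}\max_{\{E_b\}} ds_F^2, \qquad (68)$$

where $ds_F$ is the Fisher metric with $p(b) = \mathrm{tr}(\rho E_b)$, $dp(b) = \mathrm{tr}(d\rho\,E_b)$ and $s^* = 1/2$ being the value of s that achieves the minimum in (32). The maximization of (36) over the local measurements $\{E_b\}$, which commutes with the minimization over s as long as $p(b) \neq 0, 1$, results in [34]

$$ds_{BU}^2 = \frac{1}{2}\sum_{ij}\frac{|\langle i|d\rho|j\rangle|^2}{\lambda_i + \lambda_j}, \qquad (69)$$

or equivalently,

$$ds_{BU}^2 = \sum_i\frac{(d\lambda_i)^2}{4\lambda_i} + \sum_{i<j}\frac{(\lambda_i - \lambda_j)^2}{\lambda_i + \lambda_j}\left|(U^\dagger dU)_{ij}\right|^2, \qquad (70)$$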



where we use the same notation as in (47) and (52), respectively. This is the Bures-Uhlmann metric, which, as mentioned above, can be also obtained from the Bures distance (38) [45]. From (68) we then have

$$ds_{CC}^2 = \frac{1}{2}ds_{BU}^2 = \frac{1}{2}\left[1 - F(\rho,\rho - d\rho)\right] \qquad (71)$$

for strictly mixed states (the last equality holds to order $d\rho^2$). The corresponding prior probability distribution (quantum Jeffreys prior) was derived and calculated in the works on the Bures prior cited in Sec. V A.

If one of the states is pure (say $\rho_0$, as in previous sections) then the classical distribution p(b) becomes degenerate [p(0) = 1] for the optimal choice $E_0 = \rho_0$ (recall the last comments in Sec. IV C), and the previous derivation does not hold. In this case, the optimal choice of s in (32) is obtained by taking the limit $s \to 0$, as we already discussed in Sec. IV C. Recalling the first equality in (35), we obtain $D_{CC}(\rho,\rho - d\rho) = -\log[p(0) - dp(0)] = dp(0)$ [note that $dp(0) \ge 0$ since $1 \ge p(0) - dp(0) = 1 - dp(0)$], which is linear in dp(b) and therefore does not define a proper metric in probability space. From the results of Sec. IV C we also know that if one of the states is pure then $D_{CC}(\rho_0,\rho_1) = D_{QC}(\rho_0,\rho_1) = -\log F(\rho_0,\rho_1)$ and therefore

$$ds_{CC}^2 = 1 - F(\rho,\rho - d\rho) = ds_{BU}^2 \qquad (72)$$

for pure states. This agrees with the previous discussion since $dp(0) = 1 - F(\rho,\rho - d\rho)$ if $\rho$ is a pure state. Eq. (72) has to be taken with special care. It gives a valid metric for the set of pure states (which only includes variations in the unitary parameters), i.e., when $\rho - d\rho$ is also a pure state ($\rho - d\rho = U\rho U^\dagger$). Moreover, for pure states $ds_{CC}^2$ coincides with the Fubini-Study metric [recall that the Bures-Uhlmann metric is Fubini-Study adjusted, hence this statement follows from Eq. (72)]. By combining Eqs. (71) and (72), we see that $ds_{CC}^2$ shows a discontinuity when the mixed state $\rho$ approaches the set of pure states. The quantum Chernoff metric (47) does not have this pathology. This can be seen by comparing the $i < j$ ($d\lambda_i = 0$) terms in (52) with those in (70) (the diagonal terms i = j coincide). As $\lambda_i \to \delta_{i1}$ ($\rho$ approaches a pure state), we readily see that $ds_{QC}^2 \to ds_{BU}^2$. In the opposite situation, when $\rho$ approaches the completely mixed state 1/d, we can write $\lambda_i = 1/d + \epsilon_i$, where $\epsilon_i$ approaches zero. Expanding the $i < j$ terms in both (52) and (70) we can check that $ds_{QC}^2 = \frac{1}{2}ds_{BU}^2$ up to terms of order $\epsilon^3$. We conclude that the quantum Chernoff metric smoothly interpolates between the two components (that on strictly mixed states and that on pure states) of the local metric $ds_{CC}^2$. We will come back to this point in the next section, where qubit states are discussed as an example to illustrate the results in this and in previous sections.
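The two limits quoted above can be illustrated by comparing the weights that multiply $|(U^\dagger dU)_{ij}|^2$ in the off-diagonal terms of (52) and (70). This is a small numerical sketch; the eigenvalue pairs and function names are arbitrary choices made here for illustration.

```python
import numpy as np

def chernoff_weight(li, lj):
    """Off-diagonal weight of the quantum Chernoff metric, Eq. (52)."""
    return (np.sqrt(li) - np.sqrt(lj)) ** 2

def bures_weight(li, lj):
    """Off-diagonal weight of the Bures-Uhlmann metric, Eq. (70)."""
    return (li - lj) ** 2 / (li + lj)

# Nearly pure state: eigenvalues (1 - eps, eps) with eps -> 0.
eps_pure = 1e-8
ratio_pure = chernoff_weight(1 - eps_pure, eps_pure) / bures_weight(1 - eps_pure, eps_pure)

# Nearly completely mixed qubit: eigenvalues 1/2 +/- eps.
eps_mixed = 1e-3
ratio_mixed = (chernoff_weight(0.5 + eps_mixed, 0.5 - eps_mixed)
               / bures_weight(0.5 + eps_mixed, 0.5 - eps_mixed))

print(ratio_pure, ratio_mixed)   # approaches 1 and 1/2, respectively
```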



VI. QUBIT STATES 

In this section we apply our results to qubit mixed states, that is, general two-dimensional states. We will first study the distinguishability measures $D_{QC}$ and $D_{CC}$ and then move on to the corresponding metrics and priors.

For qubits one has $\rho_i = (1 + \vec r_i\cdot\vec\sigma)/2$, i = 0, 1, where $\vec r_i$ is the Bloch vector of $\rho_i$, $0 \le |\vec r_i| = r_i \le 1$. The eigenvalues of $\rho_i$ are $p_i = (1 + r_i)/2$ and $\bar p_i = 1 - p_i$. It is straightforward to obtain

$$\mathrm{tr}\,\rho_0^{s}\rho_1^{1-s} = \left(p_0^{s}p_1^{1-s} + \bar p_0^{s}\bar p_1^{1-s}\right)\cos^2\frac{\theta}{2} + \left(p_0^{s}\bar p_1^{1-s} + \bar p_0^{s}p_1^{1-s}\right)\sin^2\frac{\theta}{2}, \qquad (73)$$

where $\theta$ is the angle between $\vec r_0$ and $\vec r_1$. The value of s that minimizes $Q_s$ and hence gives (14) and (27) is in general a function of $r_0$, $r_1$ and $\theta$. However, one can check that in the particular case $r_0 = r_1 = r$ the minimum is at $s^* = 1/2$.^5
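As a consistency check of Eq. (73), the closed formula can be compared with a direct matrix evaluation of $\mathrm{tr}\,\rho_0^s\rho_1^{1-s}$, and the location of the minimum for equal purities can be scanned on a grid. The sketch below makes its own assumptions: Bloch vectors are placed in the x-z plane, and the parameter values and helper names are illustrative only.

```python
import numpy as np

SX = np.array([[0.0, 1.0], [1.0, 0.0]])
SZ = np.array([[1.0, 0.0], [0.0, -1.0]])

def bloch_state(r, theta):
    """Qubit state with Bloch vector of length r at polar angle theta in the x-z plane."""
    return 0.5 * (np.eye(2) + r * (np.sin(theta) * SX + np.cos(theta) * SZ))

def mat_power(rho, s):
    w, v = np.linalg.eigh(rho)
    w = np.clip(w, 0.0, None)
    return (v * w ** s) @ v.conj().T

def q_s_direct(rho0, rho1, s):
    return np.real(np.trace(mat_power(rho0, s) @ mat_power(rho1, 1.0 - s)))

def q_s_formula(r0, r1, theta, s):
    """Right-hand side of Eq. (73)."""
    p0, p1 = 0.5 * (1 + r0), 0.5 * (1 + r1)
    q0, q1 = 1 - p0, 1 - p1
    c2, s2 = np.cos(theta / 2) ** 2, np.sin(theta / 2) ** 2
    return ((p0 ** s * p1 ** (1 - s) + q0 ** s * q1 ** (1 - s)) * c2
            + (p0 ** s * q1 ** (1 - s) + q0 ** s * p1 ** (1 - s)) * s2)

gap = abs(q_s_direct(bloch_state(0.8, 0.0), bloch_state(0.5, 1.0), 0.3)
          - q_s_formula(0.8, 0.5, 1.0, 0.3))

# Equal purities: the minimum over s is expected at s* = 1/2.
s_grid = np.linspace(0.0, 1.0, 201)
s_star = s_grid[np.argmin([q_s_formula(0.6, 0.6, np.pi / 2, s) for s in s_grid])]
print(gap, s_star)
```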

In Fig. 2 we plot the quantum Chernoff distinguishability measure $D_{QC}(\rho_0,\rho_1)$ and the measure based on local measurements $D_{CC}(\rho_0,\rho_1)$ together with the bounds (28) provided by the fidelity, for states of equal purity $r_0 = r_1 = r$ and for $\theta = \pi/2$. Notice that in general local measurements perform much worse than the collective ones and $D_{CC}(\rho_0,\rho_1)$ runs remarkably close to (actually, coincides with) the fidelity lower bound (25) for most values of r. However, as it approaches the pure-state regime ($r \to 1$) it rapidly increases towards its upper bound. The reason for this rapid change can be understood by recalling the unanimity vote protocol discussed in Sec. IV C. For two pure states, $\rho_i = |\psi_i\rangle\langle\psi_i|$ (as corresponds to r = 1), it boils down to [30] projecting along one of the states, say $|\psi_0\rangle$, and its orthogonal, $|\psi_0^\perp\rangle$. After performing this measurement on each of the N copies, if all of them project on $|\psi_0\rangle$, one claims that the unknown state is $|\psi_0\rangle$ (hypothesis $H_0$). However, if at least one of them projects on $|\psi_0^\perp\rangle$ the guess is $|\psi_1\rangle$ (one accepts $H_1$). This corresponds to $\xi = 1$ in (7). For pure states it reaches the joint-measurement Chernoff bound by making use of a much less demanding local-measurement protocol (the optimal local strategy for finite N has also been studied in the literature).

FIG. 2: (Color online) Measures of distinguishability between two qubit states with relative angle $\theta = \pi/2$ for different values of $r = r_0 = r_1$: values extrapolated from exact evaluation of the probability of error for $30 \le N \le 35$ (dots); bounds provided by the fidelity, Eq. (28) (shaded); measure based on identical local measurements, i.e., $D_{CC}(\rho_0,\rho_1)$ (dashed); measure based on collective measurements, i.e., $D_{QC}(\rho_0,\rho_1)$ (solid line).

In contrast, near the completely mixed state 1/2, for low r, the optimal local strategy consists in choosing the measurement $\{E_0,E_1\}$ such that $p = p_0(0) = \mathrm{tr}(\rho_0E_0) = \mathrm{tr}(\rho_1E_1) = p_1(1) = q$, with $p > 1/2$. In this case, the acceptance of either $H_0$ or $H_1$ is done on the basis of a majority vote protocol: $H_0$ is accepted if the outcome 0 occurs more times than the outcome 1 does, i.e., the acceptance threshold is N/2. It follows that $s^* = 1/2$. Therefore, the lower bound provided by the fidelity, Eq. (25), is saturated [$s = s^* = 1/2$ saturates the second inequality in (33) and thus it also saturates (34)]. This protocol is optimal up to a given value of the purity, i.e., for $r < r^*(\theta)$. For larger values of r the 'voting rule' (given by $\xi$) starts changing and so does $s^*$. Accordingly, $D_{CC}(\rho_0,\rho_1)$ moves away from its lower bound to end up saturating its upper bound at r = 1.

5 Qubit states are an example for which the doubly stochastic matrix $D_{ij} = |\langle i|U|j\rangle|^2$ is symmetric ($D_{ij} = D_{ji}$). Therefore, for isospectral states, $Q_s(\rho,U\rho U^\dagger) = \sum_{ij}\lambda_i^{s}\lambda_j^{1-s}D_{ij} = \frac{1}{2}\sum_{ij}(\lambda_i^{s}\lambda_j^{1-s} + \lambda_j^{s}\lambda_i^{1-s})D_{ij}$, which has its minimum at $s^* = 1/2$.

We next consider the metrics induced by local and by joint measures. The former, in particular, requires special attention because of the abrupt behavior of $D_{CC}(\rho_0,\rho_1)$ near the set of pure states. Indeed the critical value $r^*(\theta)$, beyond which majority vote is no longer optimal, goes to one as the relative angle between the Bloch vectors of the states becomes smaller; $r^*(\theta) \to 1$ as $\theta \to 0$. As a result, the sudden increase of $D_{CC}(\rho_0,\rho_1)$ develops into a jump discontinuity at r = 1 [from $-(1/2)\log F(\rho_0,\rho_1)$ if r < 1 to $-\log F(\rho_0,\rho_1)$ if r = 1]. For this reason, when defining the corresponding metric we have to distinguish these two regions: the set of strictly mixed states (r < 1) and the set of pure states (r = 1).

In the region r < 1 the outcome probabilities will never be degenerate and the metric reduces to the Fisher metric, which upon optimization over local measurements coincides with one-half the Bures metric:

$$ds_{CC}^2 = \frac{1}{2}ds_{BU}^2 = \frac{1}{8}\left(\frac{dr^2}{1-r^2} + r^2d\Omega^2\right), \qquad (74)$$

where $d\Omega^2 = d\theta^2 + \sin^2\theta\,d\phi^2$ is the usual metric on the 2-sphere.

In the region r = 1 (pure states), the aforementioned unanimity vote protocol is optimal and the resulting metric is

$$ds_{CC}^2 = \frac{1}{4}d\Omega^2 = ds_{FS}^2, \qquad (75)$$

where $ds_{FS}^2$ is the well known Fubini-Study metric, which, as mentioned above, also coincides with the Bures metric in the limiting case $r \to 1$. We notice again that $ds_{CC}^2$ in Eq. (75) is a factor 2 larger than $\lim_{r\to1}ds_{CC}^2$ in Eq. (74), where the limit is taken along the lines dr = 0. The local distinguishability measure thus induces a discontinuous metric or, phrased in a different way, two different metrics for pure states or for strictly mixed states.

This can be visualized using the Uhlmann representation, that is, by embedding the Bloch sphere $r \le 1$ in $R^4$. To this end, one simply needs to define the new coordinate as $t = \cos\tau$, where $\sin\tau = r$. In spherical coordinates one has

$$ds_{CC}^2 = \begin{cases}\dfrac{1}{8}\left(d\tau^2 + \sin^2\tau\,d\Omega^2\right), & 0 \le \tau < \pi/2 \\[4pt] \dfrac{1}{4}d\Omega^2, & \tau = \pi/2\end{cases} \qquad (76)$$

where the first line corresponds to strictly mixed states and the second to pure states. We note that in the second (first) line $ds_{CC}^2$ is nothing but the standard metric on a 2-sphere (the top half of a 3-sphere) of radius $2^{-1}$ ($2^{-3/2}$).

FIG. 3: (Color online) Uhlmann representation of the set of single qubit states according to metric $ds_{CC}$, based on $P_e^{\rm loc}$ for local repeated measurements (A and B), and according to the quantum Chernoff metric $ds_{QC}$, based on $P_e$ for general joint measurements (C).

In Fig. 3, A and B represent (the slice z = 0 of) these two manifolds. One readily sees that the radius of B (pure states) is a factor $\sqrt{2}$ larger than that of the limiting circle of A (for $r \to 1$, $t \to 0$, i.e., $\tau \to \pi/2$).

The quantum Chernoff (collective-measurement based) metric can be readily obtained from (27) [or (52) particularized to qubit mixed states]:

$$ds_{QC}^2 = \frac{1}{8}\left[\frac{dr^2}{1-r^2} + 2\left(1 - \sqrt{1-r^2}\right)d\Omega^2\right]. \qquad (77)$$

This metric quantifies distinguishability of qubit states in a precise and operational way, and encapsulates the full power of quantum mechanics. It approaches the Fubini-Study metric $ds_{FS}^2$ for pure states and also $ds_{CC}^2$ for very mixed states, i.e. for small r. The metric smoothly interpolates between the two regimes. By defining $r = \sin 2\tau$ with $0 \le \tau \le \pi/4$ we obtain again the standard metric on a 3-sphere but this time of radius $1/\sqrt{2}$:

$$ds_{QC}^2 = \frac{1}{2}\left(d\tau^2 + \sin^2\tau\,d\Omega^2\right). \qquad (78)$$

The corresponding manifold is denoted by C in Fig. 3. Geometrically the space of states endowed with the quantum Chernoff metric $ds_{QC}$ is a spherical cap defined by $0 \le \tau \le \pi/4$ whose radius is twice that of the Bures-like hemisphere A. In order to emphasize that the two metrics are equal up to order $r^3$ at $r \approx 0$, i.e., $\tau \approx 0$ (near the completely mixed state 1/2), in the figure we have shifted the center of the larger sphere so as to make the two manifolds tangent at $\tau = 0$. The fact that $ds_{QC}^2 - \frac{1}{2}ds_{BU}^2 = ds_{QC}^2 - ds_{CC}^2 = O(r^4)$ is a particular example of a general relation that we discussed at the end of Sec. IV B.

From the quantum Chernoff metric one can obtain a proper finite distance (satisfying the triangle inequality) by, for example, computing the geodesic distance,

$$d_{QC}(\rho_0,\rho_1) = \frac{1}{\sqrt{2}}\arccos\left(\cos\tau_0\cos\tau_1 + \sin\tau_0\sin\tau_1\cos\theta\right), \qquad (79)$$

where $r_i = \sin 2\tau_i$ and $\theta$ is the relative angle between the respective Bloch vectors.

The volume element and the prior distribution of density matrices for qubit mixed states, which we here denote as $\mathcal{P}[\rho(\vec r)]$, can be easily obtained from the above metrics. According to the local and quantum Chernoff metrics we have respectively:

$$\mathcal{P}_{CC}[\rho(\vec r)] = \frac{\sin\theta}{\pi^2}\,\frac{r^2}{\sqrt{1-r^2}}, \qquad (80)$$

$$\mathcal{P}_{QC}[\rho(\vec r)] = \frac{\sin\theta}{2\pi(\pi-2)}\,\frac{1-\sqrt{1-r^2}}{\sqrt{1-r^2}}, \qquad (81)$$

where it is understood that r and $\theta$ are the length and the azimuthal angle of the Bloch vector of $\rho$. Since the Haar volume density on the 2-sphere is $\sin\theta/(4\pi)$, we see that the eigenvalues of $\rho$, $\lambda_\pm = (1 \pm r)/2$, are distributed according to

$$\mathcal{P}_{CC}(\lambda_\pm) = \frac{4}{\pi}\,\frac{r^2}{\sqrt{1-r^2}}, \qquad (82)$$

$$\mathcal{P}_{QC}(\lambda_\pm) = \frac{2}{\pi-2}\,\frac{1-\sqrt{1-r^2}}{\sqrt{1-r^2}}. \qquad (83)$$

(One can check that the latter agrees with our results in Sec. V.) This has been recently used in [47] to assess the accuracy of different quantum tomographic measurements.
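A quick numerical sanity check of the radial densities (82) and (83) is to verify that they integrate to one over $0 \le r \le 1$. The sketch below uses a midpoint rule after the substitution $r = \sin\tau$, which removes the integrable endpoint singularity at r = 1; the quadrature scheme and function names are choices made here for illustration.

```python
import numpy as np

def radial_density_cc(r):
    """Eq. (82): density of r under the local (Bures-like) prior."""
    return 4.0 / np.pi * r ** 2 / np.sqrt(1.0 - r ** 2)

def radial_density_qc(r):
    """Eq. (83): density of r under the quantum Chernoff prior."""
    root = np.sqrt(1.0 - r ** 2)
    return 2.0 / (np.pi - 2.0) * (1.0 - root) / root

def integrate_over_r(density, n=200000):
    """Midpoint rule in tau, with r = sin(tau) and dr = cos(tau) dtau."""
    h = 0.5 * np.pi / n
    tau = (np.arange(n) + 0.5) * h
    return np.sum(density(np.sin(tau)) * np.cos(tau)) * h

norm_cc = integrate_over_r(radial_density_cc)
norm_qc = integrate_over_r(radial_density_qc)
print(norm_cc, norm_qc)   # both expected to be close to 1
```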



VII. GAUSSIAN STATES 

We now illustrate our results with infinite-dimensional systems. In particular we will focus on the family of single-mode Gaussian states. This is a very significant class of quantum states mainly for two reasons. First, it has a very simple mathematical characterization that allows for the derivation of otherwise highly non-trivial results, and, second, it describes accurately states of light that are realized with current technology. In the following we show that the quantum Chernoff information, besides being the natural distinguishability measure, has the advantage of being relatively easy to compute. The calculation of the fidelity, for instance, is much more involved, as is apparent from [48, 49, 50, 51, 52], where one can find such calculations for different classes of Gaussian states.

Gaussian states are by definition those that have a Gaussian characteristic function. The (symmetrically ordered) characteristic function of one such state, $\rho$, is:

$$\chi(u) = \mathrm{tr}[D(u)\rho] = \exp\left(-i\,u^t\sigma\xi - \frac{1}{4}u^t\sigma^t\Gamma\sigma u\right), \qquad (84)$$

where t denotes transposition, $\sigma$ is the symplectic matrix

$$\sigma = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \qquad (85)$$

and $D(u) = \exp[i(u_2q - u_1p)]$ is the displacement operator, with $u = (u_1,u_2)^t$ and with position and momentum operators satisfying $[q,p] = i$. The annihilation and creation operators, defined as $a = (q + ip)/\sqrt{2}$ and $a^\dagger = (q - ip)/\sqrt{2}$, fulfil the canonical commutation relations. The positivity of $\rho$ implies that the $2\times2$ covariance matrix $\Gamma$ is real-symmetric and satisfies $\Gamma + i\sigma \ge 0$. A symplectic transformation is a linear transformation $S\,(q,p)^t$ that preserves the commutation relations, or more succinctly $S\sigma S^t = \sigma$. Under such a transformation the displacement vector $\xi = (q,p)^t$ and the covariance matrix transform as $\xi \to S\xi$ and $\Gamma \to S\Gamma S^t$ respectively.

An equivalent, more physical, definition can be given by the action of the squeezing operator $\mathcal{S}(r,\phi) = \exp\left[\frac{r}{2}\left(e^{-i2\phi}a^2 - e^{i2\phi}(a^\dagger)^2\right)\right]$ and the displacement operator D(u) defined above, on a thermal state $\rho_\beta = (1 - e^{-\beta})\sum_n e^{-\beta n}|n\rangle\langle n|$, where the Fock states $|n\rangle$ satisfy $a^\dagger a|n\rangle = n|n\rangle$:

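$$\rho(\beta,\xi,r,\phi) = D(\xi)^\dagger\mathcal{S}(r,\phi)^\dagger\rho_\beta\,\mathcal{S}(r,\phi)D(\xi). \qquad (86)$$

The covariance matrix of a thermal state is simply $\Gamma_\beta = \gamma_\beta 1$, with $\gamma_\beta^{-1} = \tanh(\beta/2)$. The squeezing operator $\mathcal{S}(r,\phi)$ induces the symplectic transformation $S_{r,\phi} = O_\phi D_rO_\phi^t$, where

$$D_r = \begin{pmatrix} e^{r} & 0 \\ 0 & e^{-r} \end{pmatrix}, \qquad O_\phi = \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix}, \qquad (87)$$

and the latter corresponds to a rotation in phase-space, i.e. to the unitary operation $Q(\phi) = \exp[i\phi\,a^\dagger a]$. One thus finds that the covariance matrix can be written as $\Gamma = \gamma_\beta S_{r,\phi}S_{r,\phi}^t$.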

In order to calculate the Chernoff bound it is suffi- 
cient to realize that any power p s of any Gaussian state 
p is also a Gaussian (unnormalized) state with a rescaled 
temperature: 

p((3, £, r, 4>Y = a5(0 f S(r, 4>) ] p s S(r, 0)2) (0 

= ^, s D(O t §(r,0)V s /3§(r-,0)2)(O 

where we have used the relation 



Pj = (l-e-") 



|nXn| = JV)j, a p s/ j, (89) 



with ATa !S = (1 - e- /3 ) s /(l - e"' 33 ). Recall now that 
given any two gaussian states pa and ps, one can write 
the inner product tr paPb in terms of their displacement 
vectors and covariance matrices as: 

\x{p A p B ) = 2 [det(r A + r B )]-* e - 5t ( r -+ r -)" 5 , (90) 

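where $\delta = \xi_A - \xi_B$. Using this equation we find that the
quantum Chernoff bound (14) is $Q = \min_s Q_s$ with

$$Q_s = \mathrm{tr}(\rho_0^s\rho_1^{1-s})
= 2N_{\beta_0,s}N_{\beta_1,1-s}\left[\det(\tilde\Gamma_0+\tilde\Gamma_1)\right]^{-1/2} e^{-\delta^T(\tilde\Gamma_0+\tilde\Gamma_1)^{-1}\delta}, \qquad (91)$$

where $\tilde\Gamma_0 = \gamma_{s\beta_0}S_{r_0,\phi_0}S_{r_0,\phi_0}^T$,
$\tilde\Gamma_1 = \gamma_{(1-s)\beta_1}S_{r_1,\phi_1}S_{r_1,\phi_1}^T$, and $\delta = \xi_0 - \xi_1$.

To simplify the notation we will denote the covariance
matrix of the Gaussian state with $\beta = \infty$ as $A = S_{r,\phi}S_{r,\phi}^T$.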
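
The expression for $Q_s$ above can be evaluated directly for any pair of single-mode Gaussian states. The sketch below is only illustrative: it assumes the conventions $\gamma_x = \coth(x/2)$ and $A = O_\phi\,\mathrm{diag}(e^{2r}, e^{-2r})\,O_\phi^T$, encodes a state as a hypothetical tuple `(beta, (q, p), r, phi)`, and replaces the exact minimization over $s$ by a coarse grid search.

```python
import math

def gamma(x):
    """gamma_x = coth(x/2), the thermal covariance prefactor."""
    return 1.0 / math.tanh(x / 2.0)

def norm_factor(beta, s):
    """N_{beta,s} = (1 - e^-beta)^s / (1 - e^(-s beta)); expm1 avoids cancellation."""
    return (-math.expm1(-beta)) ** s / (-math.expm1(-s * beta))

def a_matrix(r, phi):
    """A = O_phi diag(e^{2r}, e^{-2r}) O_phi^T, returned as (a11, a12, a22)."""
    c, sn = math.cos(phi), math.sin(phi)
    e_p, e_m = math.exp(2.0 * r), math.exp(-2.0 * r)
    return (e_p * c * c + e_m * sn * sn,
            (e_m - e_p) * c * sn,
            e_p * sn * sn + e_m * c * c)

def q_s(s, state0, state1):
    """Q_s = tr(rho0^s rho1^(1-s)); each state is (beta, (q, p), r, phi)."""
    (b0, xi0, r0, p0), (b1, xi1, r1, p1) = state0, state1
    g0, g1 = gamma(s * b0), gamma((1.0 - s) * b1)
    a0, a1 = a_matrix(r0, p0), a_matrix(r1, p1)
    # M = Gamma~_0 + Gamma~_1 (symmetric 2x2 matrix)
    m11 = g0 * a0[0] + g1 * a1[0]
    m12 = g0 * a0[1] + g1 * a1[1]
    m22 = g0 * a0[2] + g1 * a1[2]
    det_m = m11 * m22 - m12 * m12
    dq, dp = xi0[0] - xi1[0], xi0[1] - xi1[1]
    # delta^T M^{-1} delta, using the explicit inverse of a 2x2 matrix
    quad = (m22 * dq * dq - 2.0 * m12 * dq * dp + m11 * dp * dp) / det_m
    pref = 2.0 * norm_factor(b0, s) * norm_factor(b1, 1.0 - s)
    return pref * math.exp(-quad) / math.sqrt(det_m)

def chernoff(state0, state1, grid=2000):
    """Grid-search estimate of Q = min_{0<s<1} Q_s; returns (Q, s*)."""
    best = min(range(1, grid), key=lambda k: q_s(k / grid, state0, state1))
    return q_s(best / grid, state0, state1), best / grid

state0 = (1.0, (0.0, 0.0), 0.5, 0.2)
state1 = (2.0, (0.8, -0.3), 0.1, 1.0)
Q, s_star = chernoff(state0, state1)
```

For identical states `q_s` returns one for every $s$, which is a convenient consistency check on the prefactor; a finer search (or a one-dimensional optimizer) can replace the grid when more digits are needed.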



A. States with equal covariance matrices 

If two general Gaussian states $\rho_0$ and $\rho_1$ are identical
modulo a relative displacement $\delta$, i.e. $\rho_1 = \hat{D}(\delta)\rho_0\hat{D}(\delta)^\dagger$,
we find that

$$Q_s = e^{-\delta^T(\tilde\Gamma_0+\tilde\Gamma_1)^{-1}\delta}
= e^{-(\gamma_{s\beta}+\gamma_{(1-s)\beta})^{-1}\,\delta^T A^{-1}\delta}, \qquad (92)$$

where in the first equality we used the fact that the factor
multiplying the exponential in (91) must be equal to one,
since it is independent of $\delta$ and for $\delta = 0$ one must have
$\rho_0 = \rho_1$, which implies that $Q_s = 1$. That is,

$$2N_{\beta,s}N_{\beta,1-s} = \left[\det(\gamma_{s\beta}A+\gamma_{(1-s)\beta}A)\right]^{1/2}
= \gamma_{s\beta}+\gamma_{(1-s)\beta}, \qquad (93)$$

where we have used that symplectic transformations have
unit determinant, i.e., $\det A = \det(SS^T) = 1$. One
readily sees that $Q_s$, Eq. (92), attains its minimum
at $s^* = 1/2$, hence we find that in this case the Chernoff
measure is:



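$$Q = \min_s Q_s = \exp\left[-\frac{1}{2\gamma_{\beta/2}}\,\delta^T A^{-1}\delta\right]
= \exp\left[-\frac{1}{2}\,\delta^T O_\phi D_r^{-2} O_\phi^T\delta\,\tanh\frac{\beta}{4}\right]
= \exp\left[-\frac{|\delta|^2}{2}\left(e^{-2r}\cos^2\theta + e^{2r}\sin^2\theta\right)\tanh\frac{\beta}{4}\right], \qquad (94)$$

where $\theta$ is the relative angle between the squeezing axis
and the displacement vector, i.e., if $\delta = O_{\phi'}(|\delta|,0)^T$
then $\theta = \phi - \phi'$.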
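
As a sanity check of Eqs. (92)-(94), one can minimize $Q_s$ on a grid and compare with the closed form. This is a sketch under the conventions $\gamma_x = \coth(x/2)$ and $\delta^T A^{-1}\delta = |\delta|^2(e^{-2r}\cos^2\theta + e^{2r}\sin^2\theta)$; the helper names and the sample parameter values are arbitrary choices, not taken from the text.

```python
import math

def gamma(x):
    """gamma_x = coth(x/2)."""
    return 1.0 / math.tanh(x / 2.0)

def norm_factor(beta, s):
    """N_{beta,s} = (1 - e^-beta)^s / (1 - e^(-s beta))."""
    return (-math.expm1(-beta)) ** s / (-math.expm1(-s * beta))

def quad_form(delta_norm, theta, r):
    """delta^T A^{-1} delta for a displacement at angle theta to the squeezing axis."""
    return delta_norm ** 2 * (math.exp(-2.0 * r) * math.cos(theta) ** 2
                              + math.exp(2.0 * r) * math.sin(theta) ** 2)

def q_s_displaced(s, beta, delta_norm, theta, r):
    """Q_s for two states that differ only by a displacement, Eq. (92)."""
    return math.exp(-quad_form(delta_norm, theta, r)
                    / (gamma(s * beta) + gamma((1.0 - s) * beta)))

def q_closed(beta, delta_norm, theta, r):
    """Closed form at s* = 1/2, Eq. (94)."""
    return math.exp(-0.5 * quad_form(delta_norm, theta, r) * math.tanh(beta / 4.0))

beta, dn, theta, r = 1.3, 0.9, 0.4, 0.6
grid = [k / 1000 for k in range(1, 1000)]

# Location and value of the minimum over the grid
s_star = min(grid, key=lambda s: q_s_displaced(s, beta, dn, theta, r))
q_min = q_s_displaced(s_star, beta, dn, theta, r)

# Largest relative violation of the prefactor identity, Eq. (93), over the grid
prefactor_gap = max(
    abs(2.0 * norm_factor(beta, s) * norm_factor(beta, 1.0 - s)
        / (gamma(s * beta) + gamma((1.0 - s) * beta)) - 1.0)
    for s in grid)
```

The grid minimum sits at $s = 1/2$ because $\coth$ is convex on the positive axis, so $\gamma_{s\beta}+\gamma_{(1-s)\beta}$ is smallest at the symmetric point.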



B. States with the same temperature 

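We can generalize the previous result to states that
have the same spectra, i.e., the same temperature ($\beta_0 =
\beta_1 = \beta$). In this case we can use (93) to find

$$Q_s = (\gamma_{s\beta}+\gamma_{(1-s)\beta})\,\det\left[\gamma_{s\beta}A_0+\gamma_{(1-s)\beta}A_1\right]^{-1/2}
\times \exp\left[-\delta^T(\gamma_{s\beta}A_0+\gamma_{(1-s)\beta}A_1)^{-1}\delta\right]. \qquad (95)$$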
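The determinant can be explicitly written in a compact
form as

$$\det\left[\gamma_{s\beta}\mathbb{1}+\gamma_{(1-s)\beta}\tilde{A}\right]
= \gamma_{s\beta}^2+\gamma_{(1-s)\beta}^2+2\gamma_{s\beta}\gamma_{(1-s)\beta}\cosh(2\vartheta), \qquad (96)$$

where we have defined

$$\tilde{A} = S_{r_0,\phi_0}^{-1}S_{r_1,\phi_1}S_{r_1,\phi_1}^T(S_{r_0,\phi_0}^{-1})^T
= S_{\vartheta,\varphi}S_{\vartheta,\varphi}^T, \qquad (97)$$

with

$$\cosh 2\vartheta = \cos^2(\phi_0-\phi_1)\cosh[2(r_0-r_1)]
+ \sin^2(\phi_0-\phi_1)\cosh[2(r_0+r_1)]. \qquad (98)$$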



With this generality $s^*$, the optimal value of $s$, is a
complicated function of the states' parameters.$^{6}$ In the
case of $\delta = 0$, i.e., states with no relative displacement
and the same temperature, the minimization over $s$ can
be done analytically, and one finds $s^* = 1/2$. The quantum
Chernoff measure becomes:

$$Q = \frac{1}{\cosh\vartheta}
= \left[\cosh^2(r_0-r_1)+\sin^2(\phi_0-\phi_1)\sinh 2r_0\,\sinh 2r_1\right]^{-1/2}. \qquad (99)$$



Notice that this expression is independent of the temper- 
ature (or purity) of the states. That is, the distinguisha- 
bility of two arbitrary Gaussian states with no relative 
displacement and equal temperature is independent of 
the degree of mixedness of the states. 
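
The temperature independence of Eq. (99) can be illustrated numerically by minimizing Eq. (95) with $\delta = 0$ at several values of $\beta$. The following is a sketch under the assumed conventions $\gamma_x = \coth(x/2)$ and $A_i = O_{\phi_i}\,\mathrm{diag}(e^{2r_i}, e^{-2r_i})\,O_{\phi_i}^T$; function names and sample squeezing parameters are hypothetical.

```python
import math

def gamma(x):
    """gamma_x = coth(x/2)."""
    return 1.0 / math.tanh(x / 2.0)

def a_matrix(r, phi):
    """A = O_phi diag(e^{2r}, e^{-2r}) O_phi^T, returned as (a11, a12, a22)."""
    c, sn = math.cos(phi), math.sin(phi)
    e_p, e_m = math.exp(2.0 * r), math.exp(-2.0 * r)
    return (e_p * c * c + e_m * sn * sn,
            (e_m - e_p) * c * sn,
            e_p * sn * sn + e_m * c * c)

def q_s_centered(s, beta, sq0, sq1):
    """Eq. (95) with delta = 0; sq0 and sq1 are (r, phi) pairs."""
    g0, g1 = gamma(s * beta), gamma((1.0 - s) * beta)
    a0, a1 = a_matrix(*sq0), a_matrix(*sq1)
    m11 = g0 * a0[0] + g1 * a1[0]
    m12 = g0 * a0[1] + g1 * a1[1]
    m22 = g0 * a0[2] + g1 * a1[2]
    return (g0 + g1) / math.sqrt(m11 * m22 - m12 * m12)

def q_numeric(beta, sq0, sq1, grid=1000):
    """Grid minimum of Q_s over 0 < s < 1."""
    return min(q_s_centered(k / grid, beta, sq0, sq1) for k in range(1, grid))

def q_closed(sq0, sq1):
    """Closed form of Eq. (99)."""
    (r0, p0), (r1, p1) = sq0, sq1
    return (math.cosh(r0 - r1) ** 2
            + math.sin(p0 - p1) ** 2 * math.sinh(2.0 * r0) * math.sinh(2.0 * r1)) ** -0.5

sq0, sq1 = (0.8, 0.3), (0.2, 1.1)
results = {beta: q_numeric(beta, sq0, sq1) for beta in (0.5, 2.0, 6.0)}
```

All entries of `results` should agree with `q_closed(sq0, sq1)` regardless of the temperature used in the minimization.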



C. Chernoff metric for Gaussian states



Following the definition (40) and using the previous
results we find that the Chernoff metric is

$$ds^2_{QC} = \frac{d\beta^2}{32\sinh^2\frac{\beta}{2}}
+ \frac{dr^2+d\phi^2\sinh^2 2r}{2}
+ \frac{1}{2}\left(e^{-2r}dq_\phi^2+e^{2r}dp_\phi^2\right)\tanh\frac{\beta}{4}, \qquad (100)$$

where we have defined the rotated displacement variables
$(q_\phi,p_\phi) = (q,p)O_\phi$ and we have used that for infinitesimal
changes $s^* = 1/2$. We find again that the metric is
independent of the temperature under variations of the
squeezing parameters $r$ and $\phi$.

The (unnormalized) quantum Jeffreys prior can be obtained
from the metric tensor:

$$P_{QC}(\rho) \propto \sqrt{|\det g|} = \frac{1}{16\sqrt{2}}\,\frac{\tanh(\beta/4)}{\sinh(\beta/2)}\,\sinh 2r. \qquad (101)$$
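
The volume element (101) follows from the product of the diagonal entries of the metric (100). A minimal sketch of this arithmetic is given below; it assumes the metric is diagonal in the coordinates $(\beta, r, \phi, q_\phi, p_\phi)$ with the prefactors written in the comments, and it takes $r > 0$ so that $\sinh 2r$ is positive. Function names are hypothetical.

```python
import math

def qc_metric_diag(beta, r):
    """Assumed diagonal entries of the Chernoff metric in (beta, r, phi, q_phi, p_phi)."""
    t = 0.5 * math.tanh(beta / 4.0)
    return [
        1.0 / (32.0 * math.sinh(beta / 2.0) ** 2),  # d(beta)^2 coefficient
        0.5,                                        # dr^2 coefficient
        0.5 * math.sinh(2.0 * r) ** 2,              # d(phi)^2 coefficient
        math.exp(-2.0 * r) * t,                     # d(q_phi)^2 coefficient
        math.exp(2.0 * r) * t,                      # d(p_phi)^2 coefficient
    ]

def jeffreys_from_metric(beta, r):
    """sqrt(det g) for a diagonal metric tensor."""
    return math.sqrt(math.prod(qc_metric_diag(beta, r)))

def jeffreys_closed(beta, r):
    """Closed form of Eq. (101), valid for r > 0."""
    return (math.tanh(beta / 4.0) / math.sinh(beta / 2.0)
            * math.sinh(2.0 * r) / (16.0 * math.sqrt(2.0)))

samples = [(0.7, 0.3), (2.0, 1.1), (5.0, 0.05)]
pairs = [(jeffreys_from_metric(b, r), jeffreys_closed(b, r)) for b, r in samples]
```

The factors $e^{\mp 2r}$ of the displacement block cancel in the determinant, which is why the prior depends on the squeezing only through $\sinh 2r$.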



The metric induced by the local measure on the set of
mixed states is given by one-half the Bures metric$^{7}$:

$$ds^2_{\overline{QC}} = \frac{d\beta^2}{32\sinh^2\frac{\beta}{2}}
+ \frac{1}{4}\left(e^{-2r}dq_\phi^2+e^{2r}dp_\phi^2\right)\tanh\frac{\beta}{2}
+ \frac{1}{4}\left(dr^2+d\phi^2\sinh^2 2r\right)(1+\mathrm{sech}\,\beta). \qquad (102)$$

We note that $ds^2_{\overline{QC}} \to \frac{1}{2}ds^2_{QC}$ as $\rho$ approaches the set of
pure states ($\beta \to \infty$) along the lines $d\beta = 0$, in agreement
with the general statement at the end of Sec. IV B.
In the limit of very mixed states ($\beta \approx 0$) the quantum
Chernoff and local metric coincide up to first order in $\beta$.
In this limit of high temperatures ($\beta \approx 0$, highly mixed
states) the quantum Chernoff metric and Jeffreys prior
agree with those derived from Bures distance (modulo
the omnipresent factor 1/2). In particular this implies
that the analysis in [54] of the Bures volume element in
this high temperature regime also applies here.
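
The two limits quoted above can be illustrated by comparing the prefactors of the displacement and squeezing terms in the two metrics. The sketch below assumes the prefactors $\frac{1}{2}\tanh(\beta/4)$ and $\frac{1}{2}$ for the quantum Chernoff metric, and $\frac{1}{4}\tanh(\beta/2)$ and $\frac{1}{4}(1+\mathrm{sech}\,\beta)$ for the local metric; these are assumptions of the example, and the function names are hypothetical.

```python
import math

def qc_coefficients(beta):
    """(displacement, squeezing) prefactors of the quantum Chernoff metric (assumed)."""
    return 0.5 * math.tanh(beta / 4.0), 0.5

def local_coefficients(beta):
    """(displacement, squeezing) prefactors of the local metric (assumed)."""
    return 0.25 * math.tanh(beta / 2.0), 0.25 * (1.0 + 1.0 / math.cosh(beta))

def ratios(beta):
    """Local-to-Chernoff ratio for the displacement and squeezing terms."""
    qc, loc = qc_coefficients(beta), local_coefficients(beta)
    return tuple(l / q for l, q in zip(loc, qc))

cold = ratios(50.0)   # nearly pure states: both ratios approach 1/2
hot = ratios(1e-3)    # highly mixed states: both ratios approach 1
```

With these prefactors the deviation of the ratios from one is of order $\beta^2$ at small $\beta$, consistent with agreement up to first order in $\beta$.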



VIII. SUMMARY AND CONCLUSIONS 

We have analyzed quantum state discrimination (sym- 
metric hypothesis testing) and the classical and quantum 
Chernoff bound focussing on the link between them and 
the concept of measures (distances) and metrics on the 
space of quantum states. More precisely, we have been 
concerned with defining measures and metrics that have 
a clear operational meaning, so that they can as a mat- 
ter of principle be obtained from experiments. The error 
probability in state discrimination, or rather its asymp- 
totic rate exponent (error exponent), has been shown to 
provide the natural link. Thus, the concept of distin- 
guishability measure has emerged and has been analyzed 
in depth throughout the central part of this work. Before 
doing so, we have reviewed the methods and the main re- 
sults of classical and quantum hypothesis testing in the 
first three sections of the paper. Qubit and Gaussian 
states have provided two excellent, very relevant exam- 
ples to illustrate our results in the last sections. 

Our main points and results are summarized as follows: 
The quantum Chernoff bound gives an upper bound to
the error probability in state discrimination. When the
unknown state (which we are asked to identify as either
one or the other of two known states) is a tensor product,
corresponding to many identical copies, the quantum
Chernoff information (which is essentially the log
of the quantum Chernoff bound) gives the error exponent
of the optimal discrimination protocol. We propose
this quantity as a distinguishability measure for general
mixed states. We show that the quantum Chernoff measure
is not attainable by protocols that use local fixed
measurements (those for which the same measurement is
performed on each of the individual copies). Given the
practical relevance of these types of protocols (they can
be realized with current technology), we define a local
distinguishability measure as the error exponent of the best
such protocol and present its main features. We derive
the metrics induced by these measures and their corresponding
volume elements. The latter provide a means to
define operational prior probability distributions of density
matrices. We derive them for general matrices of
arbitrary dimension.

6 In contrast to the claims in Exercise 3.9, page 77 of [17], it is not
generally the case that for states with equal spectra the minimum
of $Q_s$ is reached for $s^* = 1/2$.

7 There seems to be a typo in [53] in the contribution of small
displacements of Eq. (13).

Examples of all the above are given in the last part 
of the paper. For qubit and Gaussian states, we give 
explicit formulas for the distinguishability measures and 
their corresponding metrics and volume elements. We 
give a geometrical picture of the space of qubit states 
based on those metrics. This space can be viewed as 
a spherical cap, similar to the Uhlmann hemisphere, with
the pure states sitting on the rim. These examples also
illustrate the fact that the quantum Chernoff measure,
besides being the most natural distance between general 
states, is conveniently easy to compute relative to other 
distances, such as the widely used fidelity. 